= Subtitles / Untertitel

[[software:voctoweb]] currently has two ways for published talks to get subtitles:
- the classic workflow via the Django app on c3subtitles.de ([[https://github.com/c3subtitles/subtitleStatus|subtitleStatus]])
- via Whisper/[[https://github.com/bugbakery/transcribee|Transcribee]] through the [[https://publishing.c3voc.de/docs|Publishing API]]

Put simply, both always come down to two tasks:
- publishing the SRT/VTT file
- creating the metadata (aka "Subtitle Recording") in voctoweb, so that the subtitles also show up in the player

== Publishing Gateway/API

To avoid having to deal with the quirks of the c3voc infrastructure, a REST API was set up in 2023/2024. It allows publishing additional data such as slides, subtitles etc. purely via REST, without having to use SFTP or similar:
- https://publishing.c3voc.de/docs
- https://github.com/voc/publishing-gw or https://forgejo.c3voc.de/voc/publishing-gw

The required API keys are either static or bearer tokens from https://sso.c3voc.de/ (more details in the Git repo).

== Transcribee

For 38C3/39C3, https://github.com/bugbakery/transcribee-voctoweb-glue-2/wiki/How-to-create-subtitles was added:

At the moment, the process looks like this:
1. The glue code creates an entry for every talk uploaded to the congress event on media.ccc.de (state: `new`)
2. A [transcribee](https://github.com/bugbakery/transcribee) document is automatically created for every talk (state: `preparing`)
3. transcribee creates a draft transcription
4. You go to https://subtitles.bugbakery.org and claim a talk with the state `needs correction` or `partially corrected`
5. You correct the draft in transcribee using the `Open Editor` button
6. When you want to stop correcting, you click `Finish work` and
   - enter the time up to which the transcript is corrected (MM:SS or HH:MM:SS) and click `Ok`
   - or choose `Until the end` when the correction of the whole talk is finished (state: `done`)
7. When the whole talk is corrected, you can hit `Publish`, which uploads the transcript to the talk on https://media.ccc.de. Please note that it can take a few moments for the new version to become visible to you due to caching.

If you want to fix a mistake in already published subtitles, you start with step 4 and claim a talk with the state `done`.

Next steps:
- host the software that currently runs at https://subtitles.bugbakery.org/ at https://subtitles.c3voc.de/ and
  - connect it to the SSO
  - have all new talks on media.ccc.de flow into it by default

--- //[[andi@muc.ccc.de|Andi]] 2026/06/20//

== Classic workflow via c3subtitles.de/Amara/etc.

Note: The rest of this page describes the old workflow, which is currently no longer in active use. For a while the team had its own infrastructure at https://wiki.c3subtitles.de, hosted at selfnet. Unfortunately, this VM is no longer reachable, for unknown reasons.
=== Components

- [[https://github.com/voc/scripts/blob/master/subtitles/sync_media_recordings.py|sync_media_recordings.py]] uses the CSV export from `subtitleStatus` to publish subtitles in [[software:voctoweb|voctoweb]] via REST API:
  * was running every 10 minutes via systemd timer [[intern:server:releasing.c3voc.de]]`:/etc/systemd/system/publish-subtitles.timer`
- [subtitleStatus](https://github.com/c3subtitles/subtitleStatus) (Django app @ c3subtitles.de)
  * dashboard with an overview of the transcription status per conference and talk
  * workflow manager
  * …
  * pushes finished subtitle files (SRT) to mirror.selfnet.de via rsync

{{::docs:architecture-overview-subtitles.png?600|}}

=== CSV-Export from C3Subtitles:

https://c3subtitles.de/media_export/2020-12-30T0:00:00.99Z

Example:

^ `GUID` ^ `complete` ^ `media_language` ^ `srt_language` ^ `last_changed_on_amara` ^ `revision` ^ `url` ^ `touched` ^ `amara_key` ^ `amara_language` ^ `state` ^ `amara_subtitle_url` ^
| db11e86c-ecf8-40c0-b8d8-2f6798507146 | False | eng | en | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11546-eng-deu-Measuring_radioactivity_using_low-cost_silicon_sensors.en.srt | 2021-01-03T12:30:45Z | P4gHpqpuJJIA | en | 7 | https://amara.org/api/videos/P4gHpqpuJJIA/languages/en/subtitles/ |
| 221560a2-7470-4e90-9190-99a2bef53238 | False | deu | de | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11574-deu-Globalisierung_Digitalisierung_und_die_Wachstumsfrage.de.srt | 2021-01-06T10:37:48Z | dSkmesksKqIe | de | 2 | https://amara.org/api/videos/dSkmesksKqIe/languages/de/subtitles/ |
| 6beabddc-2dd6-43d2-9936-618d41d42cde | True | deu | de | 1-01-01T00:00:00Z | 5 | https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9744-deu-eng-Inside_the_Fake_Science_Factories.de.srt | 2021-01-01T23:14:02Z | vsf2PBryeqW9 | de | 8 | https://amara.org/api/videos/vsf2PBryeqW9/languages/de/subtitles/ |
| 7bdf7688-8620-4170-93bf-3c2adfd30030 | False | deu | de | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/congress/36c3/36c3-10983-deu-eng-Its_alive_-_Nach_den_Protesten_gegen_die_Polizeigesetze_ist_vor_den_Protesten_gegen_die_autoritaere_Wende.de.srt | 2020-12-30T10:58:21Z | NWHl9bK0MXF8 | de | 2 | https://amara.org/api/videos/NWHl9bK0MXF8/languages/de/subtitles/ |
| 52ce1398-fa9b-4bd3-aa9e-6a49a764ac2c | True | deu | de | 1-01-01T00:00:00Z | 7 | https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9343-deu-eng-Court_in_the_Akten.de.srt | 2021-01-04T22:17:06Z | 3mDTDpROSDxZ | de | 8 | https://amara.org/api/videos/3mDTDpROSDxZ/languages/de/subtitles/ |
| 1cff41a8-455e-42a6-ab08-d6cb166e7d3b | False | deu | de | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-10036-deu-eng-Mondnacht.de.srt | 2021-01-04T22:12:21Z | gAbEiRm8Mocc | de | 2 | https://amara.org/api/videos/gAbEiRm8Mocc/languages/de/subtitles/ |

To download the raw (draft) subtitles from Amara, use `https://amara.org/api/videos/{amara_key}/languages/{amara_language}/subtitles/?format=vtt`, i.e. append `?format=vtt` to the `amara_subtitle_url` (compare https://apidocs.amara.org/#fetch-raw-subtitles).

==== States

For voctoweb (media.ccc.de) only states 7, 8 and 12 are relevant. Subtitle files in all other states should be ignored.
{{tablelayout?colwidth=""}}
^ ID ^ voctoweb ^ c3subtitles ^ additional information ^
| 1 | | Nothing available yet | irrelevant should not exist |
| 2 | todo | Transcribed until | should exist |
| 3 | | Transcript finished | might exist - still no timestamps |
| 4 | | Please do not touch, work in progress | Autotiming in process no timestamps |
| 5 | | Synced until | rare case of syncing by hand |
| 6 | | Syncing finished | with timestamps, usable as draft |
| 7 | draft | Quality control done until | with timestamps, usable as draft |
| 8 | complete | Job completed | finished, obviously with timestamps and usable |
| 9 | | Unknown | should not exist |
| 11 | todo | Translated until | translation, not usable as draft |
| 12 | translated | Translation is finished | finished, obviously with timestamps and usable |
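
The state filter can be illustrated with a small Python sketch. This is not the actual `sync_media_recordings.py`, only a minimal example under stated assumptions: it assumes the export is comma-separated with a header row using the column names from the example table, and the names `relevant_subtitles`, `amara_vtt_url` and `SAMPLE_EXPORT` are hypothetical. The sample rows are shortened to three entries taken from the example table.

```python
import csv
import io

# State IDs that matter for voctoweb, with the voctoweb names from the table.
VOCTOWEB_STATES = {7: "draft", 8: "complete", 12: "translated"}

# Assumption: the export is plain CSV with these column names as header row.
HEADER = [
    "GUID", "complete", "media_language", "srt_language",
    "last_changed_on_amara", "revision", "url", "touched",
    "amara_key", "amara_language", "state", "amara_subtitle_url",
]
ROWS = [
    ["db11e86c-ecf8-40c0-b8d8-2f6798507146", "False", "eng", "en",
     "1-01-01T00:00:00Z", "1",
     "https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11546-eng-deu-"
     "Measuring_radioactivity_using_low-cost_silicon_sensors.en.srt",
     "2021-01-03T12:30:45Z", "P4gHpqpuJJIA", "en", "7",
     "https://amara.org/api/videos/P4gHpqpuJJIA/languages/en/subtitles/"],
    ["221560a2-7470-4e90-9190-99a2bef53238", "False", "deu", "de",
     "1-01-01T00:00:00Z", "1",
     "https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11574-deu-"
     "Globalisierung_Digitalisierung_und_die_Wachstumsfrage.de.srt",
     "2021-01-06T10:37:48Z", "dSkmesksKqIe", "de", "2",
     "https://amara.org/api/videos/dSkmesksKqIe/languages/de/subtitles/"],
    ["6beabddc-2dd6-43d2-9936-618d41d42cde", "True", "deu", "de",
     "1-01-01T00:00:00Z", "5",
     "https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9744-deu-eng-"
     "Inside_the_Fake_Science_Factories.de.srt",
     "2021-01-01T23:14:02Z", "vsf2PBryeqW9", "de", "8",
     "https://amara.org/api/videos/vsf2PBryeqW9/languages/de/subtitles/"],
]
SAMPLE_EXPORT = "\n".join(",".join(cols) for cols in [HEADER] + ROWS)


def amara_vtt_url(amara_key, amara_language):
    """Build the raw (draft) subtitle URL described in the CSV export section."""
    return (
        f"https://amara.org/api/videos/{amara_key}"
        f"/languages/{amara_language}/subtitles/?format=vtt"
    )


def relevant_subtitles(csv_text):
    """Yield one dict per export row whose state is relevant for voctoweb."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        state = int(row["state"])
        if state not in VOCTOWEB_STATES:
            continue  # subtitle files in all other states are ignored
        yield {
            "guid": row["GUID"],
            "language": row["srt_language"],
            "voctoweb_state": VOCTOWEB_STATES[state],
            "srt_url": row["url"],
            "amara_vtt_url": amara_vtt_url(row["amara_key"], row["amara_language"]),
        }


if __name__ == "__main__":
    for entry in relevant_subtitles(SAMPLE_EXPORT):
        print(entry["guid"], entry["language"], entry["voctoweb_state"])
```

With the three sample rows, the row in state 2 is skipped and the rows in states 7 and 8 are kept as `draft` and `complete`. How the resulting entries are then sent to voctoweb is left out here, since that depends on the REST API details in the linked script.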