subtitles [2021/01/07 18:00] – andi | subtitles [2025/05/26 01:08] (current) – andi |
---|
= Subtitles / Untertitel | = Subtitles / Untertitel |
| |
| [[software:voctoweb]] currently offers two ways for published talks to receive subtitles: |
| |
=== Architecture overview in context of media.ccc.de | - the classic workflows via the Django app at c3subtitles.de ([[https://github.com/c3subtitles/subtitleStatus|subtitleStatus]]) |
| - via Whisper/[[https://github.com/bugbakery/transcribee|Transcribee]] through the [[https://publishing.c3voc.de/docs|Publishing API]] |
| |
{{::docs:architecture-overview-subtitles.png?600|}} | |
| |
<callout type="warning">Warning: The rest of this page may no longer be actively maintained. The team now has its own infrastructure: https://wiki.c3subtitles.de</callout> | Put simply, this always comes down to two tasks: |
| |
* Sync script from subtitleStatus -> voctoweb: https://github.com/voc/scripts/blob/master/subtitles/sync_media_recordings.py | - publishing the SRT/VTT file |
* systemd timer on [[intern:server:releasing.c3voc.de]] | - creating the metadata (aka "Subtitle Recording") in voctoweb so that the subtitles are also shown in the player |
* https://github.com/c3subtitles/subtitleStatus | |
* Dashboard with an overview of the transcription status per conference and talk | |
* Workflow manager | |
* … | |
* pushes finished subtitle files (SRT) to mirror.selfnet.de via rsync | |
| |
== CSV-Export from C3Subtitles: | |
| |
https://c3subtitles.de/media_export/2020-12-30T0:00:00.99Z | == Publishing Gateway/API |
| |
| To avoid having to deal with the quirks of the c3voc infrastructure, a REST API was set up in 2023/2024. It allows additional data such as slides, subtitles, etc. to be published purely via REST, without having to use SFTP or similar: |
| |
Example: | - https://publishing.c3voc.de/docs |
``` | - https://github.com/voc/publishing-gw |
GUID;complete;media_language;srt_language;last_changed_on_amara;revision;url;touched;amara_key;amara_language;state;amara_subtitle_url | |
db11e86c-ecf8-40c0-b8d8-2f6798507146;False;eng;en;1-01-01T00:00:00Z;1;https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11546-eng-deu-Measuring_radioactivity_using_low-cost_silicon_sensors.en.srt;2021-01-03T12:30:45Z;P4gHpqpuJJIA;en;2;https://amara.org/api/videos/P4gHpqpuJJIA/languages/en/subtitles/ | |
221560a2-7470-4e90-9190-99a2bef53238;False;deu;de;1-01-01T00:00:00Z;1;https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11574-deu-Globalisierung_Digitalisierung_und_die_Wachstumsfrage.de.srt;2021-01-06T10:37:48Z;dSkmesksKqIe;de;2;https://amara.org/api/videos/dSkmesksKqIe/languages/de/subtitles/ | |
6beabddc-2dd6-43d2-9936-618d41d42cde;True;deu;de;1-01-01T00:00:00Z;5;https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9744-deu-eng-Inside_the_Fake_Science_Factories.de.srt;2021-01-01T23:14:02Z;vsf2PBryeqW9;de;8;https://amara.org/api/videos/vsf2PBryeqW9/languages/de/subtitles/ | |
7bdf7688-8620-4170-93bf-3c2adfd30030;False;deu;de;1-01-01T00:00:00Z;1;https://mirror.selfnet.de/c3subtitles/congress/36c3/36c3-10983-deu-eng-Its_alive_-_Nach_den_Protesten_gegen_die_Polizeigesetze_ist_vor_den_Protesten_gegen_die_autoritaere_Wende.de.srt;2020-12-30T10:58:21Z;NWHl9bK0MXF8;de;2;https://amara.org/api/videos/NWHl9bK0MXF8/languages/de/subtitles/ | |
52ce1398-fa9b-4bd3-aa9e-6a49a764ac2c;True;deu;de;1-01-01T00:00:00Z;7;https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9343-deu-eng-Court_in_the_Akten.de.srt;2021-01-04T22:17:06Z;3mDTDpROSDxZ;de;8;https://amara.org/api/videos/3mDTDpROSDxZ/languages/de/subtitles/ | |
1cff41a8-455e-42a6-ab08-d6cb166e7d3b;False;deu;de;1-01-01T00:00:00Z;1;https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-10036-deu-eng-Mondnacht.de.srt;2021-01-04T22:12:21Z;gAbEiRm8Mocc;de;2;https://amara.org/api/videos/gAbEiRm8Mocc/languages/de/subtitles/ | |
``` | |
| |
States | The API keys required for this are currently still statically configured, but are planned to be connected to https://sso.c3voc.de/ in the future. |
| |
^ ID ^ c3subtitles ^ additional information ^ | |
| 1 | Nothing available yet | irrelevant should not exist | | |
| 2 | Transcribed until | should exist | | |
| 3 | Transcript finished | might exist - still no timestamps | | |
| 4 | Please do not touch, work in progress | Autotiming in process no timestamps | | |
| 5 | Synced until | rare case of syncing by hand | | |
| 6 | Syncing finished | with timestamps, usable as draft | | |
| 7 | Quality control done until | with timestamps, usable as draft | | |
| 8 | Job completed | finished, obviously with timestamps and usable | | |
| 9 | Unknown | should not exist | | |
| 11 | Translated until | translation, not usable as draft | | |
| 12 | Translation is finished | finished, obviously with timestamps and usable | | |
| |
| == Classic workflows |
| |
== Communication | Components |
| |
* Web: [[https://c3subtitles.de/]] | - [[https://github.com/voc/scripts/blob/master/subtitles/sync_media_recordings.py|sync_media_recordings.py]] uses the CSV export from `subtitleStatus` to publish subtitles in [[vocotoweb]] via REST API: |
* Twitter: www.twitter.com/c3subtitles | * is run every 10 minutes via systemd timer [[intern:server:releasing.c3voc.de]]`:/etc/systemd/system/publish-subtitles.timer` |
* Mailinglist: subtitles-angels -at- lists.selfnet.de | |
* Mailinglist: subtitles -at- lists.ccc.de | |
* IRC: [[irc://irc.hackint.org:9999/#subtitles|#subtitles]] on hackint (**requires SSL**), but also the #voc channel | |
* Etherpad-Domain: [[https://subtitles.pads.ccc.de]] | |
* Jabber: c3subtitles -!-at-!- jabber.ccc.de | |
* Videos on amara.org : [[http://www.amara.org/en/profiles/videos/c3subtitles/|c3subtitles videos on amara.org]] | |
* E-Mail: subtitles -!-at-!- c3voc.de | |
| |
== What is our goal? | - [[https://github.com/c3subtitles/subtitleStatus|subtitleStatus]] (Django app @ c3subtitles.de) |
| * Dashboard with an overview of the transcription status per conference and talk |
| * Workflow manager |
| * … |
| * pushes finished subtitle files (SRT) to mirror.selfnet.de via rsync |
| |
Better and more barrier-free access to the live talks and streams, and to the videos afterwards, via subtitles. This especially helps non-native speakers of the spoken languages and deaf and hard-of-hearing listeners. | |
| |
Nice side effect: finished subtitles are pretty easy to translate into any other language, and [[http://www.amara.org/en/profiles/videos/c3subtitles/|amara.org]] also provides a very easy-to-use interface for that purpose. | {{::docs:architecture-overview-subtitles.png?600|}} |
| |
| |
== How can I help? | |
| |
* If you visit the congress and use speech recognition software, please contact us! The same goes if you are a computer stenography writer or a good touch typist. | === CSV-Export from C3Subtitles: |
* If you are interested in what we are working on behind the scenes, just contact us! | |
* Help us create the subtitles via amara.org - you do not even have to visit the congress to do that! Anybody can do it from home! | |
| |
== What are our current projects behind the scenes? | https://c3subtitles.de/media_export/2020-12-30T0:00:00.99Z |
| |
* Developing software for a user interface to choose which subtitle you want to work on depending on your favorite task | Example: |
* Developing software for subtitles via computer stenography or speech recognition, visible live in the talk via webstream and later used as the starting point for the precise version to work on in [[http://www.amara.org/de/profiles/profile/162037/|amara.org]] | |
* Developing a phonetic German steno keyboard layout | |
* Building a steno keyboard | |
* Using an old mechanical stenographer with a micro controller to detect the pressed keys as steno input | |
| |
| ^ `GUID` ^ `complete` ^ `media_language` ^ `srt_language` ^ `last_changed_on_amara` ^ `revision` ^ `url` ^ `touched` ^ `amara_key` ^ `amara_language` ^ `state` ^ `amara_subtitle_url` ^ |
| | db11e86c-ecf8-40c0-b8d8-2f6798507146 | False | eng | en | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11546-eng-deu-Measuring_radioactivity_using_low-cost_silicon_sensors.en.srt | 2021-01-03T12:30:45Z | P4gHpqpuJJIA | en | 7 | https://amara.org/api/videos/P4gHpqpuJJIA/languages/en/subtitles/ | |
| | 221560a2-7470-4e90-9190-99a2bef53238 | False | deu | de | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/events/rc3/rc3-mcr-11574-deu-Globalisierung_Digitalisierung_und_die_Wachstumsfrage.de.srt | 2021-01-06T10:37:48Z | dSkmesksKqIe | de | 2 | https://amara.org/api/videos/dSkmesksKqIe/languages/de/subtitles/ | |
| | 6beabddc-2dd6-43d2-9936-618d41d42cde | True | deu | de | 1-01-01T00:00:00Z | 5 | https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9744-deu-eng-Inside_the_Fake_Science_Factories.de.srt | 2021-01-01T23:14:02Z | vsf2PBryeqW9 | de | 8 | https://amara.org/api/videos/vsf2PBryeqW9/languages/de/subtitles/ | |
| | 7bdf7688-8620-4170-93bf-3c2adfd30030 | False | deu | de | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/congress/36c3/36c3-10983-deu-eng-Its_alive_-_Nach_den_Protesten_gegen_die_Polizeigesetze_ist_vor_den_Protesten_gegen_die_autoritaere_Wende.de.srt | 2020-12-30T10:58:21Z | NWHl9bK0MXF8 | de | 2 | https://amara.org/api/videos/NWHl9bK0MXF8/languages/de/subtitles/ | |
| | 52ce1398-fa9b-4bd3-aa9e-6a49a764ac2c | True | deu | de | 1-01-01T00:00:00Z | 7 | https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9343-deu-eng-Court_in_the_Akten.de.srt | 2021-01-04T22:17:06Z | 3mDTDpROSDxZ | de | 8 | https://amara.org/api/videos/3mDTDpROSDxZ/languages/de/subtitles/ | |
| | 1cff41a8-455e-42a6-ab08-d6cb166e7d3b | False | deu | de | 1-01-01T00:00:00Z | 1 | https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-10036-deu-eng-Mondnacht.de.srt | 2021-01-04T22:12:21Z | gAbEiRm8Mocc | de | 2 | https://amara.org/api/videos/gAbEiRm8Mocc/languages/de/subtitles/ | |
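The columns shown above can be read into typed records with a few lines of Python. This is only a minimal sketch: it assumes the export is delivered as semicolon-separated text with a header row (as in the earlier revision of this page), and the function name `parse_export` is hypothetical, not taken from `sync_media_recordings.py`.

```python
import csv
import io

# Two rows copied from the example above, in the assumed
# semicolon-separated form of the export.
SAMPLE = """GUID;complete;media_language;srt_language;last_changed_on_amara;revision;url;touched;amara_key;amara_language;state;amara_subtitle_url
6beabddc-2dd6-43d2-9936-618d41d42cde;True;deu;de;1-01-01T00:00:00Z;5;https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-9744-deu-eng-Inside_the_Fake_Science_Factories.de.srt;2021-01-01T23:14:02Z;vsf2PBryeqW9;de;8;https://amara.org/api/videos/vsf2PBryeqW9/languages/de/subtitles/
1cff41a8-455e-42a6-ab08-d6cb166e7d3b;False;deu;de;1-01-01T00:00:00Z;1;https://mirror.selfnet.de/c3subtitles/congress/35c3/35c3-10036-deu-eng-Mondnacht.de.srt;2021-01-04T22:12:21Z;gAbEiRm8Mocc;de;2;https://amara.org/api/videos/gAbEiRm8Mocc/languages/de/subtitles/
"""


def parse_export(text):
    """Parse the export text into a list of dicts keyed by column name.

    `complete` is converted to bool, `revision` and `state` to int;
    all other columns stay strings.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for row in reader:
        row["complete"] = row["complete"] == "True"
        row["revision"] = int(row["revision"])
        row["state"] = int(row["state"])
        rows.append(row)
    return rows


rows = parse_export(SAMPLE)
for row in rows:
    print(row["GUID"], row["srt_language"], row["state"], row["complete"])
```

Timestamps such as `1-01-01T00:00:00Z` are deliberately left as strings here, since they are placeholder values that a strict date parser may reject.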
| |
| |
| To download the raw (draft) subtitles from Amara, use `https://amara.org/api/videos/{amara_key}/languages/{amara_language}/subtitles/?format=vtt` (compare https://apidocs.amara.org/#fetch-raw-subtitles) |
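As an illustration, the download URL can be assembled from the `amara_key` and `amara_language` columns of the export. The helper name `amara_subtitle_url` is hypothetical, and the sketch only builds the URL; whether the request needs authentication is not covered here.

```python
from urllib.parse import quote

AMARA_API = "https://amara.org/api/videos"


def amara_subtitle_url(amara_key, amara_language, fmt="vtt"):
    """Build the raw-subtitle URL for one video and language.

    All three parts are percent-encoded so that unexpected characters
    in the export cannot break the URL structure.
    """
    return (
        f"{AMARA_API}/{quote(amara_key, safe='')}"
        f"/languages/{quote(amara_language, safe='')}"
        f"/subtitles/?format={quote(fmt, safe='')}"
    )


print(amara_subtitle_url("P4gHpqpuJJIA", "en"))
```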
| |
| |
| ==== States |
| |
| For voctoweb (media.ccc.de) only states 7, 8 and 12 are relevant. Subtitle files in all other states should be ignored. |
| {{tablelayout?colwidth=""}} |
| ^ ID ^ voctoweb ^ c3subtitles ^ additional information ^ |
| | 1 | | Nothing available yet | irrelevant should not exist | |
| | 2 | todo | Transcribed until | should exist | |
| | 3 | | Transcript finished | might exist - still no timestamps | |
| | 4 | | Please do not touch, work in progress | Autotiming in process no timestamps | |
| | 5 | | Synced until | rare case of syncing by hand | |
| | 6 | | Syncing finished | with timestamps, usable as draft | |
| | 7 | draft | Quality control done until | with timestamps, usable as draft | |
| | 8 | complete | Job completed | finished, obviously with timestamps and usable | |
| | 9 | | Unknown | should not exist | |
| | 11 | todo | Translated until | translation, not usable as draft | |
| | 12 | translated | Translation is finished | finished, obviously with timestamps and usable | |
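The rule that only states 7, 8 and 12 matter for voctoweb can be expressed as a small lookup. This is a sketch under assumptions: the names `VOCTOWEB_STATE`, `voctoweb_state` and `relevant_rows` are hypothetical, and the `todo` entries for states 2 and 11 in the table are treated as "ignore", following the sentence above the table.

```python
# c3subtitles state ID -> voctoweb state, taken from the table above.
# Only the three states named as relevant are listed.
VOCTOWEB_STATE = {
    7: "draft",
    8: "complete",
    12: "translated",
}


def voctoweb_state(state_id):
    """Return the voctoweb state for a c3subtitles state ID, or None to ignore it."""
    return VOCTOWEB_STATE.get(int(state_id))


def relevant_rows(rows):
    """Keep only export rows whose `state` maps to a voctoweb state."""
    return [row for row in rows if voctoweb_state(row["state"]) is not None]


example = [{"state": "7"}, {"state": "2"}, {"state": "12"}]
print([voctoweb_state(row["state"]) for row in relevant_rows(example)])
```

Accepting both strings and integers via `int(state_id)` keeps the lookup usable directly on unconverted CSV values.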
| |
| |
| |
| == Communication |
| |
| <callout type="warning">Warning: The rest of this page may no longer be actively maintained. The team now has its own infrastructure: https://wiki.c3subtitles.de</callout> |
| |
| |
[[intern:subtitles|Intern]] | |