From 60d7f05961845ca8389a30de730d6728801b1303 Mon Sep 17 00:00:00 2001
From: MC42
Date: Tue, 19 Nov 2024 15:53:31 -0500
Subject: [PATCH 1/4] Update README.md to better reflect current components.

---
 README.md | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 84e801b..9e5d3e8 100644
--- a/README.md
+++ b/README.md
@@ -7,20 +7,34 @@ as part of [Desert Bus For Hope](https://desertbus.org).
 A full design doc can be read at [initial-design-doc.pdf](./initial-design-doc.pdf),
 but a brief overview of the components:
 
+#### Shared Components
+
+* `common` provides code shared between the other components.
+
+#### Ingest
 * `downloader` grabs segments from twitch and saves them to disk
 * `restreamer` serves segments from disk as well as playlist files allowing them to be streamed
 * `backfiller` queries restreamers of other servers in order to pick up segments this server doesn't have already,
 ie. it replicates missing segments.
-* `cutter` interacts with a database to perform cutting jobs
+* `chat_archiver` records twitch chat messages and merges them with records from other nodes.
+
+#### Processing
 * `sheetsync` syncs specifc database columns to a google doc which is the primary operator interface.
+* `cutter` interacts with a database to perform cutting jobs
 * `thrimshim` acts as an interface between the `thrimbletrimmer` editor and the database.
 * `thrimbletrimmer` is a browser based video editor.
 * `segment_coverage` regularly checks whether there is complete segment coverage for each hour.
 * `playlist_manager` adds videos to youtube playlists depending on tags.
-* `chat_archiver` records twitch chat messages and merges them with records from other nodes.
-* `database` hosts a Postgres database to store events to be edited.
+
+#### Analysis
+* `bus_analyzer` does OCR on the stream to produce timestamps of/for progress through the route.
+* `buscribe` is back-end audio speech-to-text extraction.
+* `buscribe_api` is the API provided for below to access the stored results of transcription.
+* `buscribe-web` is the web frontend to interface with `buscribe`'s output (via `buscribe_api`)
+
+#### Services
+* `postgres` hosts a Postgres database to store events to be edited.
 * `nginx` provides a webserver through which the other components are exposed to the outside world.
-* `common` provides code shared between the other components.
 * `monitoring` provides dashboards to allow the wubloader to be monitored.
 
 ### Usage

From 076c4663906b5b4bcd0b9ff2d2fbdb62bd50b754 Mon Sep 17 00:00:00 2001
From: MC42
Date: Tue, 19 Nov 2024 16:04:25 -0500
Subject: [PATCH 2/4] Small tidyup of config+install.

---
 INSTALL.md |  4 ++--
 README.md  | 10 ++++++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/INSTALL.md b/INSTALL.md
index 288d8b2..17bf10e 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -51,7 +51,7 @@ By default the `downloader`, `restreamer`, `backfiller`, `cutter`, `thrimshim`,
 
 If you are running a `cutter` you will have to place the appropriate Google credentials in a JSON file given by the `cutter_creds_file`. Likewise, if you are running the `sheetsync` service, you will have to place the appropriate credentials in the JSON file pointed to by `sheetsync_creds_file` as well as set the appropriate `sheet_id` and `worksheets` for the Google sheet to sync with. You will also need to set the appropriate `edit_url` to access `thrimbletrimmer`.
 
-## Running the wubloader
+## Running Wubloader
 
 To start the wubloader, simply run
 
@@ -61,7 +61,7 @@ To stop the wubloader and clean up, simply run
 
 `docker-compose down`
 
-## Database setup
+## Database Setup
 
 When setting up a database node, a number of database specific options can be set.
 
diff --git a/README.md b/README.md
index 9e5d3e8..fed5370 100644
--- a/README.md
+++ b/README.md
@@ -37,7 +37,7 @@ but a brief overview of the components:
 * `nginx` provides a webserver through which the other components are exposed to the outside world.
 * `monitoring` provides dashboards to allow the wubloader to be monitored.
 
-### Usage
+### Installation
 
 All components are built as docker images.
 Components which access the disk expect a shared directory mounted at `/mnt`.
@@ -46,8 +46,14 @@ A docker-compose file is provided to run all components. See `docker-compose.jso
 to set configuration options, then generate the compose file with `./generate-docker-compose`.
 Then run `docker-compose up`.
 
+To install wubloader, please refer to [INSTALL.md](./INSTALL.md) for more granular steps.
+
+#### Alternate Setups
+
+> [!WARNING]
+> Here be dragons. This config is largely maintained by one user and is not kept at parity with the rest of the tooling. It's highly suggested to not use at this juncture for new developers to wubloader.
+
 There is also a kubernetes-based option, but it is less configurable and only fully supports replication and editing nodes.
 Basic support for running the database and playlist_manager has been added, but not tested.
 See [k8s.jsonnet](./k8s.jsonnet) for details.
 
-Further details of installing and configuring the backfiller are provided in [INSTALL.md](./INSTALL.md).

From 3f23164cec3d661ecefbd1217c964ef8d315f49f Mon Sep 17 00:00:00 2001
From: MC42
Date: Tue, 19 Nov 2024 16:07:31 -0500
Subject: [PATCH 3/4] Remove suggestion to clone repo via ZIP download.

---
 INSTALL.md | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/INSTALL.md b/INSTALL.md
index 17bf10e..7ab8424 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -19,15 +19,10 @@ This installation guide is written assuming you are on a Linux-like operating sy
 
 ## Download the wubloader
 
-You can download the latest version of the wubloader from github:
-
-  https://github.com/dbvideostriketeam/wubloader/archive/master.zip
-
-Alternatively if you have `git` installed you can clone the git repository:
+Use github to clone the repository into a working directory. If you intend to make changes, it's highly suggested you clone it into a fork of the repo on your own GitHub account.
 
 `git clone https://github.com/dbvideostriketeam/wubloader`
 
-
 ## Generate the docker-compose file
 
 You can edit the `docker-compose.jsonnet` file to set the configuration options. Important options include:

From 6496f86b114c57f31409307a4c9c2bd39cb4b603 Mon Sep 17 00:00:00 2001
From: MC42
Date: Tue, 19 Nov 2024 16:11:38 -0500
Subject: [PATCH 4/4] Call out default google credential file name and also add to gitignore for both variants listed in config.

---
 .gitignore | 2 ++
 INSTALL.md | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index abaee13..44197ba 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,3 +3,5 @@
 .*.uptodate
 *.egg-info
 build
+google_creds.json # This is because the default credential location is noted as such.
+creds.json # Listed as credentials in schedulebot reference jsonnet config.
\ No newline at end of file
diff --git a/INSTALL.md b/INSTALL.md
index 7ab8424..7d0e7d8 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -44,7 +44,7 @@ After making any changes to `docker-compose.jsonnet`, you will need to rerun `ge
 
 By default the `downloader`, `restreamer`, `backfiller`, `cutter`, `thrimshim`, `segment_coverage` and `nginx` services of the wubloader will be run. To change which services are run edit the `enabled` object in `docker-compose.jsonnet`. A complete wubloader set up also requires one and only one `database` service (though having a backup database is a good idea), one and only one `sheetsync` service and one and only one `playlist_manager` service.
 
-If you are running a `cutter` you will have to place the appropriate Google credentials in a JSON file given by the `cutter_creds_file`. Likewise, if you are running the `sheetsync` service, you will have to place the appropriate credentials in the JSON file pointed to by `sheetsync_creds_file` as well as set the appropriate `sheet_id` and `worksheets` for the Google sheet to sync with. You will also need to set the appropriate `edit_url` to access `thrimbletrimmer`.
+If you are running a `cutter` you will have to place the appropriate Google credentials in a JSON file given by the `cutter_creds_file` (google_creds.json in the root repo by default). Likewise, if you are running the `sheetsync` service, you will have to place the appropriate credentials in the JSON file pointed to by `sheetsync_creds_file` as well as set the appropriate `sheet_id` and `worksheets` for the Google sheet to sync with. You will also need to set the appropriate `edit_url` to access `thrimbletrimmer`.
 
 ## Running Wubloader
 