Customs Creators Collective archive tool
Joshua Boniface 95c6a6bac4 Use fuzzier matching for filters
Lowercase to avoid capitalization issues.
2023-04-07 02:02:41 -04:00

The Customs Creators Collective archive tool allows for easy scraping to a local JSON database and downloading of files from the C3 (Customs Creators Collective) database, a collection of custom songs for Rock Band and similar clone games.

This tool exists because the C3DB is very hard to mass download from: each song must be found in the extensive list, selected manually, and a second link clicked through, before a random file name is obtained. This tool simplifies the process by first collecting information about all available songs of a particular type, and then downloading songs based on customizable filters (e.g. by genre, artist, author, etc.) and saving them in a standardized format.

To use the tool, first use the "database" command to build or modify your local JSON database, then use the "download" command to download songs.

Installation

pip

  1. Use pip3 install . to install the package to a virtualenv or your system Python. The tool will be available as c3dbdl in your shell.

Manual

  1. Install the Python3 requirements from requirements.txt.

  2. Copy the file c3dbdl/c3dbdl.py to somewhere in your $PATH. You can optionally remove the .py extension if you wish, for command compatibility with a pip installation.

Usage

Before running any command, use the built-in help via the -h/--help option to view the available option(s) of the command. This option is available everywhere by virtue of the Click tool, so use it frequently to get a comprehensive understanding of all available options and how they work.

The general process of using c3dbdl is as follows:

  1. Select a download location, and either specify it with the -d/--download-directory option or via the environment variable C3DBDL_DOWNLOAD_DIRECTORY.

  2. Select a base URL, which determines which game(s) the song list is limited to, or use the default to fetch all available songs for all games. Specify it either with the -u/--base-url option or via the environment variable C3DBDL_BASE_URL.

  3. Initialize your C3DB JSON database with c3dbdl [options] database build. This will take a fair amount of time to complete, as all pages of the chosen base URL and all song pages (30,000+) are scanned. Note that if you cancel this process, no data will be saved, so let it complete! The default concurrency setting should make this relatively quick but YMMV.

  4. Download any song(s) you want with c3dbdl [options] download [options].
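The choice between a command-line option and an environment variable in steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper name resolve_setting is hypothetical, and the precedence (command-line value first, then environment variable, then default) is an assumption based on how Click-style tools usually behave.

```python
import os


def resolve_setting(cli_value, env_name, default=None, env=None):
    """Pick a setting: explicit CLI value, else environment variable, else default.

    `env` is a mapping used for lookups; it defaults to os.environ.
    The precedence order is an assumption, not confirmed by the README.
    """
    if env is None:
        env = os.environ
    if cli_value is not None:
        return cli_value
    return env.get(env_name, default)


# Hypothetical values for illustration only.
example_env = {"C3DBDL_DOWNLOAD_DIRECTORY": "/srv/c3db"}
from_env = resolve_setting(None, "C3DBDL_DOWNLOAD_DIRECTORY", env=example_env)
from_cli = resolve_setting("/tmp/songs", "C3DBDL_DOWNLOAD_DIRECTORY", env=example_env)
```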

Database & Included Data

The database is contained in a JSON document which lists all possible songs which were scraped from the C3DB pages during the database build step.

To obtain the database, first the specified base URL is downloaded to get a list of pages, and then each page is iterated through. Within each page, all "song" table entries are extracted for information, and the song page itself visited to obtain a full list of download links. The song iteration is performed in parallel with a default of 10 simultaneous jobs (configurable with -c/--concurrency) to speed up downloading.
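The parallel song iteration described above could look roughly like the following sketch. It assumes a thread pool is used (the README only says "simultaneous jobs"), and scan_song is a hypothetical placeholder for the real page visit, so no network access happens here.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_song(entry):
    """Placeholder for visiting a song page; here it only attaches an empty link list."""
    return {**entry, "dl_links": []}


def scan_songs(entries, concurrency=10, scan=scan_song):
    """Scan all entries with up to `concurrency` simultaneous jobs, preserving input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(scan, entries))


# Hypothetical table entries extracted from a listing page.
entries = [{"title": "Boogie Nights"}, {"title": "Sweet Miracle"}]
songs = scan_songs(entries, concurrency=10)
```

Because the results are only collected once every job has finished, a sketch like this also shows why an interrupted build leaves nothing to save.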

Once all pages and songs have been scanned, the results are saved into the database file specified, which can then be reused for future downloads. Note that cancelling a database build before it is finished will result in an empty database and the process will have to be started again from the beginning.

A database file cannot be updated; it must be replaced wholesale. You can however interactively edit your local database with the database edit command should you choose to do so (for instance, to normalize album names or similar).

The contents of the database include all information required for filtering and downloading as described below. An example entry (first entry on the first page) is:

{
  "artist": "Heatwave",
  "title": "Boogie Nights",
  "album": "Too Hot to Handle",
  "song_link": "https://db.c3universe.com/song/-34018",
  "genre": "Pop-Rock",
  "year": "1976",
  "length": "0:05",
  "author": "D97", 
  "dl_links": [
    {
      "link": "https://dl.c3universe.com/642d6ab2aa5b87.10964554",
      "description": "Rock Band 3 Xbox 360"
    }
  ]
}

The c3dbdl tool is very picky about the download links (dl_links) it selects. Specifically, it will only include links from c3universe.com, and not any other external "download sites" such as Mega.nz, Angelfire, etc.

This is done because the non-interactive, command-based download method is not compatible with those sites, and we want this tool to be as automated as possible. Requiring some manual clickthrough of a web page would defeat the purpose here, and thus, we simply exclude them and require you download any such songs manually.

If a song ends up with no dl_links during scanning, for instance because they all pointed to such external "download sites", it will not be included in the database. Thus, the final number of songs in your database is guaranteed to be smaller than the total number listed on the C3DB website.
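The link selection rule can be illustrated with a short sketch. The function names are hypothetical, and matching on the URL's hostname is an assumption about how "links from c3universe.com" is decided; the external link below is a made-up example.

```python
from urllib.parse import urlparse

ALLOWED_DOMAIN = "c3universe.com"


def is_allowed(link):
    """True if the link's host is c3universe.com or one of its subdomains."""
    host = urlparse(link).hostname or ""
    return host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN)


def keep_downloadable(songs):
    """Drop external links, then drop songs left without any dl_links."""
    kept = []
    for song in songs:
        links = [dl for dl in song.get("dl_links", []) if is_allowed(dl["link"])]
        if links:
            kept.append({**song, "dl_links": links})
    return kept


scanned = [
    {"title": "Boogie Nights", "dl_links": [
        {"link": "https://dl.c3universe.com/642d6ab2aa5b87.10964554",
         "description": "Rock Band 3 Xbox 360"}]},
    # Hypothetical song whose only link points to an external site.
    {"title": "External Only", "dl_links": [
        {"link": "https://mega.nz/file/example", "description": "Rock Band 3 Xbox 360"}]},
]
database = keep_downloadable(scanned)
```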

Downloading

Once a database has been built, you can start downloading songs.

By default, when downloading a given song, all possible download links (dl_links) will be downloaded; this can be limited by using the -i/--download-id and -d/--download-descr options to pick and choose specific files.

Once a song has been downloaded, assuming that the file structure doesn't change, subsequent download runs will not overwrite it and will simply skip downloading the file.
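The skip-if-present behaviour can be sketched like this. It assumes the check is simply whether the target path already exists; download_once and the fetch callable are hypothetical stand-ins for the real download step.

```python
import tempfile
from pathlib import Path


def download_once(target, fetch):
    """Write fetch() to target unless the file already exists; return True if written."""
    path = Path(target)
    if path.exists():
        return False  # already downloaded in an earlier run, so skip it
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(fetch())
    return True


# Demonstration in a temporary directory with fake file contents.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "Rush" / "Sweet Miracle.ejthedj.sweetMiracle"
    first = download_once(target, lambda: b"song data")
    second = download_once(target, lambda: b"other data")
    contents = target.read_bytes()
```

Note that the check depends on the output path, which is why changing the file structure between runs would cause files to be downloaded again.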

Filtering

Filtering out the songs in the database is a key part of this tool. You might want to be able to grab only select genres, artists, authors, etc. to make your custom song packs.

c3dbdl is able to filter by several key categories:

  • genre: The genre of the song.
  • artist: The artist of the song.
  • album: The album of the song.
  • title: The title of the song.
  • year: The year of the album/song.
  • author: The author of the file on C3DB.

Note that we cannot filter - mostly for parsing difficulty reasons - by instrument type or difficulty, by song length, or by any other information not mentioned above.

Filtering is always done during the download stage; the JSON database will always contain all possible entries.

To use filters, append one or more --filter options to your c3dbdl download command. A filter option begins with the literal --filter, followed by the category (e.g. genre or artist), then finally the text to filter on, for instance Rock or Santana or 2012. The text must be quoted if it contains whitespace.

If more than one filter is specified, they are treated as a logical AND, i.e. all the listed filters must apply to a given song for it to be downloaded in that run.
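The AND behaviour of multiple filters can be sketched as below. The case-insensitive substring comparison is an assumption about how matching works, and the function name matches is hypothetical; only the AND combination is stated by this README.

```python
def matches(song, filters):
    """True if every (field, text) filter applies to the song (logical AND).

    Matching here is a case-insensitive substring test, which is an assumption.
    """
    return all(
        text.lower() in str(song.get(field) or "").lower()
        for field, text in filters
    )


songs = [
    {"artist": "Rush", "album": "Vapor Trails [Remixed]",
     "title": "Sweet Miracle", "author": "ejthedj"},
    # Hypothetical second entry by the same artist, for contrast.
    {"artist": "Rush", "album": "Moving Pictures", "title": "YYZ", "author": "someone"},
    {"artist": "Heatwave", "album": "Too Hot to Handle",
     "title": "Boogie Nights", "author": "D97"},
]

filters = [("artist", "rush"), ("album", "vapor trails"), ("author", "ejthedj")]
selected = [s for s in songs if matches(s, filters)]
```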

Filters allow very specific download selections to be run. For example, let's look for all songs by Rush from the album Vapor Trails (the remixed version) authored by ejthedj:

c3dbdl download --filter artist Rush --filter album "Vapor Trails [Remixed]" --filter author ejthedj
Found 19563 songs from JSON database file 'Downloads/c3db.json'
Downloading 1 song files...
> Downloading song "Rush - Sweet Miracle" by ejthedj...
Downloading file "Rock Band 3 Xbox 360" from https://dl.c3universe.com/s/ejthedj/sweetMiracle...
Successfully downloaded to ../Prog/ejthedj/Rush/Vapor Trails [Remixed]/Sweet Miracle [2002].sweetMiracle

In this case, one song matched and was downloaded. Feel free to experiment with the various filters to find exactly what you're looking for.

Output Format

When downloading files, it may be advantageous to customize the output directory and filename structure to better match what you plan to do with the files. For instance, for pure organization you might want nicely laid out files with clear directory structures and names, while for Onyx packaging you might want everything in a flat directory.

c3dbdl provides complete flexibility in the output file format. When downloading, use the --file-structure option to set the file structure. This value is an interpolated string containing one or more field variables, which are mapped at download time. The available fields are:

  • genre: The genre of the song.
  • artist: The artist of the song.
  • album: The album of the song.
  • title: The title of the song.
  • year: The year of the album/song.
  • author: The author of the file on C3DB.
  • orig_name: The original filename that would be downloaded by e.g. a browser.

The default structure leverages most of these options to create an archive-ready structure as follows:

{artist}/{album}/{title}.{author}.{orig_name}

As an example, the song from the previous section would be saved under this default structure as:

Rush/Vapor Trails [Remixed]/Sweet Miracle.ejthedj.sweetMiracle

The genre is excluded because in my experience it is a fairly useless metric and is often incorrectly set, so it gets in the way more often than not. You are free of course to add it in to your own custom structure. The year is excluded for similar reasons and because if you know the album, you know the year.

If any field is missing during download, it is replaced with "None".
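The field interpolation can be sketched with Python's str.format. This is an illustration under assumptions: build_path is a hypothetical name, and using str.format is a guess at the mechanism; the field list, default structure, and "None" substitution come from the text above.

```python
FIELDS = ("genre", "artist", "album", "title", "year", "author", "orig_name")
DEFAULT_STRUCTURE = "{artist}/{album}/{title}.{author}.{orig_name}"


def build_path(song, structure=DEFAULT_STRUCTURE):
    """Fill the structure string from the song; missing fields become "None"."""
    # str(None) is "None", so absent fields are substituted as described above.
    values = {name: str(song.get(name)) for name in FIELDS}
    return structure.format(**values)


song = {
    "artist": "Rush",
    "album": "Vapor Trails [Remixed]",
    "title": "Sweet Miracle",
    "author": "ejthedj",
    "orig_name": "sweetMiracle",
}
default_path = build_path(song)
flat_path = build_path(song, "{artist} - {title}.{orig_name}")
```

A flat structure such as the second call is one way to get everything into a single directory.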

Note that any parent director(ies) will be automatically created down the whole tree until the final filename.

Help

This is a quick and dirty tool I wrote to quickly grab collections of songs. I provide no guarantee of success when using this tool. If you have issues, please open an issue on this repository and provide full details of your problem.