Add c2dbdl script
commit cdc67eb114
@ -0,0 +1,128 @@
# C3DB Download Tool

The C3DB Download Tool allows for easy scraping to a local JSON database and downloading of files from the C3
(Customs Creators Collective) database, a collection of custom songs for Guitar Hero, Rock Band, and similar clone
games.

This tool exists because the C3DB is very hard to mass download from: each song must be found in the extensive
list, selected manually, and a second link clicked through, before a random file name is obtained. This tool
simplifies the process by first collecting information about all available songs of a particular type, and then
downloading songs based on customizable filters (e.g. by genre, artist, author, etc.) and outputting them in a
standardized format.

To use the tool, first use the "database" command to build or modify your local JSON database, then use the
"download" command to download songs.

To avoid overloading or abusing the C3DB website, this tool operates exclusively in sequential mode by design; at
most one page is scraped (for "database build") or one song downloaded (for "download") at a time. Additionally, the
JSON database of songs is stored locally, so it only needs to be built once and is then reused to perform actual
downloads without putting further load on the website.

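The sequential, retry-with-backoff behaviour described above can be sketched roughly as follows. This is a minimal illustration rather than the script's own code: the helper name `fetch_sequentially` and its injectable `fetch` and `pause` parameters are assumptions made for the example, while the five-attempt limit and the growing delay mirror what the script does when scraping pages.

```python
from time import sleep
from typing import Callable, List, Optional


def fetch_sequentially(
    urls: List[str],
    fetch: Callable[[str], str],
    max_attempts: int = 5,
    pause: Callable[[float], None] = sleep,
) -> List[Optional[str]]:
    """Fetch each URL one at a time, retrying failures with a growing delay."""
    results: List[Optional[str]] = []
    for url in urls:
        body: Optional[str] = None
        for attempt in range(1, max_attempts + 1):
            try:
                body = fetch(url)
                break
            except Exception:
                # Wait a little longer after each failed attempt
                pause(attempt)
        # None marks a URL that never succeeded within max_attempts
        results.append(body)
    return results
```

With `requests` installed, `fetch` could be something like `lambda url: requests.get(url).text`; because only one request is ever in flight, the load on the website stays low.
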
## Installation

1. Install the Python3 requirements from `requirements.txt`.

1. Copy the script into a virtualenv or somewhere in your `$PATH`, or execute it directly from this folder (see Usage below).

## Usage

Before running a command, use the built-in help via the `-h`/`--help` option to view the available option(s) of
the command.

The general process of using `c3dbdl` is as follows:

1. Select a download location, and either specify it with the `-d`/`--download-directory` option or via the
   environment variable `C3DBDL_DOWNLOAD_DIRECTORY`.

1. Select a base URL. Use this to determine which game(s) you want to limit the database to, or use the default to
   fetch all available songs for all games, and either specify it with the `-u`/`--base-url` option or via the
   environment variable `C3DBDL_BASE_URL`.

1. Initialize your C3DB JSON database with `c3dbdl [options] database build`. This will take a fair amount
   of time to complete as all pages of the chosen base URL are scanned. Note that if you cancel this process, no
   data will be saved, so let it complete!

1. Download any song(s) you want with `c3dbdl [options] download [options]`.

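For orientation, the database that `database build` writes is a JSON list of flat objects, one per song. The sketch below reproduces the keys the script collects (`dl_link`, `artist`, `title`, `album`, `genre`, `year`, `length`, `author`); every value, including the URL, is an invented placeholder rather than real C3DB data.

```python
import json

# Placeholder entry: the keys match what the script stores, the values are invented
example_entry = {
    "dl_link": "https://example.invalid/song-page",
    "artist": "Example Artist",
    "title": "Example Song",
    "album": "Example Album",
    "genre": "Rock",
    "year": "2012",  # scraped as text, so stored as a string
    "length": "3:45",
    "author": "ExampleAuthor",
}

# The database file is a JSON list of such entries, written with indent=2
database_text = json.dumps([example_entry], indent=2) + "\n"

# Reading it back yields plain Python lists and dicts
songs = json.loads(database_text)
print(f"Found {len(songs)} songs")
```

Because the file is plain JSON, it can be inspected or adjusted by hand, which is what the `database edit` command is for.
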
## Filtering

Filtering the songs in the database is a key part of this tool. You might want to grab only select
genres, artists, authors, etc. to make your custom song packs.

`c3dbdl` is able to filter by several key categories:

* `genre`: The genre of the song.
* `artist`: The artist of the song.
* `album`: The album of the song.
* `title`: The title of the song.
* `year`: The year of the album/song.
* `author`: The author of the file on C3DB.

Note that we *cannot* filter - mostly for parsing difficulty reasons - by instrument type or difficulty, by song
length, or by any other information not mentioned above.

Filtering is always done during the download stage; the JSON database will always contain all possible entries.

To use filters, append one or more `--filter` options to your `c3dbdl download` command. A filter option begins
with the literal `--filter`, followed by the category (e.g. `genre` or `artist`), then finally the text to filter
on, for instance `Rock` or `Santana` or `2012`. The text must be quoted if it contains whitespace, and it must
match the database value exactly (matching is case-sensitive).

If more than one filter is specified, they are treated as a logical AND, i.e. all the listed filters must apply to
a given song for it to be downloaded in that run.

Filters allow powerfully specific download selections to be run. For example, let's look for all songs by Rush
from the album Vapor Trails (the remixed version) authored by ejthedj:

```
c3dbdl download --filter artist Rush --filter album "Vapor Trails [Remixed]" --filter author ejthedj
```

This should find, as of 2023-04-02, exactly one song, "Sweet Miracle":

```
Found 28942 songs from JSON database file 'Downloads/c3db.json'
Downloading 1 song files...
Downloading song "Rush - Sweet Miracle" by ejthedj...
Downloading from https://dl.c3universe.com/s/ejthedj/sweetMiracle...
```

Feel free to experiment.

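The AND semantics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script itself: the function name `select_songs` is hypothetical, the second and third sample entries are invented, and `dict.get` is used so that an unknown category simply matches nothing (the script indexes the entry directly instead).

```python
from typing import Dict, Iterable, List, Tuple

Song = Dict[str, str]


def select_songs(songs: Iterable[Song], filters: Iterable[Tuple[str, str]]) -> List[Song]:
    """Keep only the songs for which every (category, text) filter matches exactly."""
    filters = list(filters)
    return [
        song for song in songs
        if all(song.get(category) == text for category, text in filters)
    ]


songs = [
    {"artist": "Rush", "album": "Vapor Trails [Remixed]", "title": "Sweet Miracle", "author": "ejthedj"},
    # The two entries below are invented purely for the example
    {"artist": "Rush", "album": "Example Album", "title": "Example Song", "author": "SomeoneElse"},
    {"artist": "Santana", "album": "Another Album", "title": "Another Song", "author": "ejthedj"},
]

matches = select_songs(songs, [
    ("artist", "Rush"),
    ("album", "Vapor Trails [Remixed]"),
    ("author", "ejthedj"),
])
print([song["title"] for song in matches])
```

With an empty filter list every song is selected, since `all()` over no conditions is true; this corresponds to running `download` without any `--filter` option.
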
## Output Format

When downloading files, it may be advantageous to customize the output directory and filename structure to better
match what you plan to do with the files. For instance, for pure organization you might want nicely laid out
files with clear directory structures and names, while for Onyx packaging you might want everything in a flat
directory.

`c3dbdl` provides complete flexibility in the output file format. When downloading, use the `--file-structure`
option to set the file structure. This value is an interpolated string containing one or more field variables,
which are mapped at download time. The available fields are:

* `genre`: The genre of the song.
* `artist`: The artist of the song.
* `album`: The album of the song.
* `title`: The title of the song.
* `year`: The year of the album/song.
* `author`: The author of the file on C3DB.
* `orig_name`: The original filename that would be downloaded by e.g. a browser.

The default structure leverages all of these fields to create an archive-ready structure as follows:

```
{genre}/{artist}/{album}/{title} [{year}] ({author}).{orig_name}
```

As an example:

```
Prog/Rush/Vapor Trails [Remixed]/Sweet Miracle [2002] (ejthedj).sweetMiracle
```

Note that any parent director(ies) will be automatically created down the whole tree until the final filename.

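The field substitution above is ordinary Python string formatting, so a structure can be tried out before downloading anything. The sketch below is only an illustration: the helper names `render_structure` and `prepare_target` are hypothetical, and the entry values are taken from the example above.

```python
import os
import tempfile

DEFAULT_STRUCTURE = "{genre}/{artist}/{album}/{title} [{year}] ({author}).{orig_name}"


def render_structure(structure: str, entry: dict, orig_name: str) -> str:
    """Fill the field variables in a file-structure string for one song."""
    return structure.format(
        genre=entry["genre"],
        artist=entry["artist"],
        album=entry["album"],
        title=entry["title"],
        year=entry["year"],
        author=entry["author"],
        orig_name=orig_name,
    )


def prepare_target(destination: str, relative_path: str) -> str:
    """Join the download directory and rendered path, creating parent directories."""
    target = os.path.join(destination, relative_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    return target


entry = {
    "genre": "Prog",
    "artist": "Rush",
    "album": "Vapor Trails [Remixed]",
    "title": "Sweet Miracle",
    "year": "2002",
    "author": "ejthedj",
}

relative = render_structure(DEFAULT_STRUCTURE, entry, "sweetMiracle")
# A flat layout only needs a structure without any "/" in it
flat = render_structure("{artist} - {title}.{orig_name}", entry, "sweetMiracle")

with tempfile.TemporaryDirectory() as destination:
    target = prepare_target(destination, relative)
    parent_exists = os.path.isdir(os.path.dirname(target))

print(relative)
print(flat)
```

Fields that a structure does not mention are simply ignored, which is why the flat layout can use only `artist`, `title`, and `orig_name`.
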
## Help

This is a quick and dirty tool I wrote to quickly grab collections of songs. I provide no guarantee of success
when using this tool. If you have issues, please open an issue on this repository and provide *full details*
of your problem.
@ -0,0 +1,404 @@
#!/usr/bin/env python3

import click
import requests
import re
import json
import os
from time import sleep
from difflib import unified_diff
from colorama import Fore
from bs4 import BeautifulSoup
from urllib.error import HTTPError

CONTEXT_SETTINGS = dict(help_option_names=['-h', '--help'], max_content_width=120)


def buildDatabase(pages=None):
    found_songs = []

    if pages is None:
        r = requests.get(f"{config['base_songs_url']}")
        if r.status_code != 200:
            return

        root_page_html = BeautifulSoup(r.text, 'html.parser')
        pages = int(root_page_html.body.find('a', attrs={'class':'paginationLastPage'}).get('href').replace('?page=', ''))

    click.echo(f"Collecting data from {pages} pages")

    # Get a list of song URIs
    for i in range(1, pages + 1):
        attempts = 1
        p = None
        while attempts <= 5:
            try:
                click.echo(f"Parsing page {i} (attempt #{attempts})...")
                p = requests.get(f"{config['base_songs_url']}?page={i}")
                break
            except Exception:
                sleep(attempts)
                attempts += 1
        if p is None or p.status_code != 200:
            break

        parsed_html = BeautifulSoup(p.text, 'html.parser')

        table_html = parsed_html.body.find('div', attrs={'class':'portlet-body'}).find('tbody')

        for entry in table_html.find_all('tr', attrs={'class':'odd'}):
            if len(entry) < 1:
                break

            song_entry = dict()

            for idx, td in enumerate(entry.find_all('td')):
                if idx == 1:
                    # Download link
                    song_entry["dl_link"] = td.find('a', attrs={'target':'_blank'}).get('href')
                elif idx == 2:
                    # Artist
                    song_entry["artist"] = td.find('a').get_text().strip().replace('/', '+')
                elif idx == 3:
                    # Song
                    song_entry["title"] = td.find('div', attrs={'class':'c3ttitlemargin'}).get_text().strip().replace('/', '+')
                    song_entry["album"] = td.find('div', attrs={'class':'c3tartist'}).get_text().strip().replace('/', '+')
                elif idx == 4:
                    # Genre
                    song_entry["genre"] = td.find('a').get_text().strip()
                elif idx == 5:
                    # Year
                    song_entry["year"] = td.find('a').get_text().strip()
                elif idx == 6:
                    # Length
                    song_entry["length"] = td.find('a').get_text().strip()
                elif idx == 8:
                    # Author (of chart)
                    song_entry["author"] = td.find('a').get_text().strip().replace('/', '+')

            if song_entry and song_entry['title']:
                click.echo(f"Found song entry for {song_entry['artist']} - {song_entry['title']} by {song_entry['author']}")
                found_songs.append(song_entry)

    return found_songs


def downloadSong(destination, filename, entry):
    click.echo(f"""Downloading song "{entry['artist']} - {entry['title']}" by {entry['author']}...""")

    # Fetch the song page and extract the real download link from it
    try:
        p = requests.get(entry['dl_link'])
        if p.status_code != 200:
            raise HTTPError(entry['dl_link'], p.status_code, "", None, None)

        parsed_html = BeautifulSoup(p.text, 'html.parser')
        download_url = parsed_html.body.find('div', attrs={'class':'lock-head'}).find('a').get('href')
    except Exception as e:
        click.echo(f"Failed parsing or retrieving HTML link: {e}")
        return None

    download_filename = filename.format(
        genre=entry['genre'],
        artist=entry['artist'],
        album=entry['album'],
        title=entry['title'],
        year=entry['year'],
        author=entry['author'],
        orig_name=download_url.split('/')[-1],
    )
    download_filename = f"{destination}/{download_filename}"
    download_path = '/'.join(f"{download_filename}".split('/')[0:-1])

    if not os.path.exists(download_path):
        os.makedirs(download_path)

    if os.path.exists(download_filename):
        click.echo(f"File exists at {download_filename}")
        return None

    click.echo(f"""Downloading from {download_url}...""")
    attempts = 1
    r = None
    try:
        # Retry the request itself, waiting a little longer after each failure
        while attempts <= 5:
            r = requests.get(download_url, stream=True)
            if r.status_code == 200:
                break
            click.echo(f"Download attempt failed: HTTP {r.status_code}; retrying {attempts}/5")
            r.close()
            sleep(attempts)
            attempts += 1
        if r is None or r.status_code != 200:
            code = r.status_code if r is not None else -1
            raise HTTPError(download_url, code, "", None, None)
        # Stream the body to disk in chunks; the response is closed on exit
        with r, open(download_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
        click.echo(f"Successfully downloaded to {download_filename}")
    except Exception as e:
        click.echo(f"Download attempt failed: {e}")
        return None


@click.command(name='build', short_help='Build the local database.')
@click.option(
    "-o", "--overwrite", '_overwrite', is_flag=True, default=False, envvar='C3DBDL_BUILD_OVERWRITE',
    help="Overwrite existing database file."
)
@click.option(
    "-p", "--pages", "_pages", type=int, default=None, envvar='C3DBDL_BUILD_PAGES',
    help="Number of pages to scan (default is all)."
)
def build_database(_overwrite, _pages):
    """
    Initialize the local JSON database of C3DB songs from the website.

    \b
    The following environment variables can be used for scripting purposes:
      * C3DBDL_BUILD_OVERWRITE: equivalent to "--overwrite"
      * C3DBDL_BUILD_PAGES: equivalent to "--pages"
    """

    if os.path.exists(config['database_filename']) and not _overwrite:
        click.echo(f"Database already exists at '{config['database_filename']}'; use '--overwrite' to rebuild.")
        exit(1)

    click.echo("Building JSON database; this will take a long time...")
    songs_database = buildDatabase(_pages)
    click.echo('')
    click.echo(f"Found {len(songs_database)} songs, dumping to database file '{config['database_filename']}'")
    if not os.path.exists(config['download_directory']):
        click.echo(f"Creating download directory '{config['download_directory']}'")
        os.makedirs(config['download_directory'])
    with open(config['database_filename'], "w") as fh:
        json.dump(songs_database, fh, indent=2)
        fh.write('\n')


@click.command(name='edit', short_help='Edit the local database in EDITOR.')
def edit_database():
    """
    Edit the local JSON database of C3DB songs with your $EDITOR.
    """

    if not os.path.exists(config['database_filename']):
        click.echo(f"WARNING: Database filename '{config['database_filename']}' does not exist!")
        click.echo("Ensure you build a database first with the 'database build' command.")
        exit(1)

    with open(config['database_filename'], "r") as fh:
        songs_database = fh.read()

    new_songs_database = click.edit(text=songs_database, require_save=True, extension='.json')
    while True:
        if new_songs_database is None:
            click.echo("Aborting with no modifications")
            exit(0)

        click.echo('')
        click.echo("Pending modifications:")
        click.echo('')
        diff = list(unified_diff(
            songs_database.split('\n'),
            new_songs_database.split('\n'),
            fromfile='current',
            tofile='modified',
            fromfiledate='',
            tofiledate='',
            n=3,
            lineterm=''))
        for line in diff:
            if re.match(r'^\+', line) is not None:
                click.echo(Fore.GREEN + line + Fore.RESET)
            elif re.match(r'^\-', line) is not None:
                click.echo(Fore.RED + line + Fore.RESET)
            elif re.match(r'^@@', line) is not None:
                # Hunk headers in a unified diff start with "@@"
                click.echo(Fore.BLUE + line + Fore.RESET)
            else:
                click.echo(line)
        click.echo('')

        # Only accept the edit once it parses as valid JSON
        try:
            json.loads(new_songs_database)
            break
        except Exception:
            click.echo('ERROR: Invalid JSON syntax.')
            click.confirm('Continue editing?', abort=True)
            new_songs_database = click.edit(text=new_songs_database, require_save=True, extension='.json')

    click.confirm('Write modifications to songs database?', abort=True)

    with open(config['database_filename'], "w") as fh:
        fh.write(new_songs_database)


@click.group(name="database", short_help='Manage the local database.')
def database():
    """
    Manage the local JSON database of C3DB songs.
    """

    pass


@click.command(name="download", short_help='Download files from C3DB.')
@click.option(
    '-s', '--file-structure', '_file_structure', envvar='C3DBDL_DL_FILE_STRUCTURE',
    default="{genre}/{artist}/{album}/{title} [{year}] ({author}).{orig_name}",
    help='Specify the output file/directory structure.'
)
@click.option(
    '-f', '--filter', '_filters', envvar='C3DBDL_DL_FILTERS',
    default=[], multiple=True,
    nargs=2,
    help='Add a filter option.'
)
@click.option(
    '-l', '--limit', '_limit', envvar='C3DBDL_DL_LIMIT',
    default=None, type=int,
    help='Limit to this many songs (first N matches).'
)
def download(_filters, _limit, _file_structure):
    """
    Download song(s) from the C3DB webpage.

    \b
    The output file structure can be specified as a path format with any of the following
    fields included, surrounded by curly braces:
      * genre: The genre of the song.
      * artist: The artist of the song.
      * album: The album of the song.
      * title: The title of the song.
      * year: The year of the album/song.
      * author: The author of the file on C3DB.
      * orig_name: The original filename from the website.

    \b
    The default output file structure is:
      "{genre}/{artist}/{album}/{title} [{year}] ({author}).{orig_name}"

    \b
    Filters allow granular selection of the song(s) to download. Multiple filters can be
    specified, and a song is selected only if ALL filters match (logical AND). Each filter
    is in the form:
      --filter [database_key] [value]

    \b
    The valid "database_key" values are identical to the output file fields above.

    \b
    For example, to download all songs in the genre "Rock":
      --filter genre Rock

    \b
    Or to download all songs by the artist "Rush" and the author "MyName":
      --filter artist Rush --filter author MyName

    \b
    The following environment variables can be used for scripting purposes:
      * C3DBDL_DL_FILE_STRUCTURE: equivalent to "--file-structure"
      * C3DBDL_DL_FILTERS: equivalent to "--filter"; limited to one instance
      * C3DBDL_DL_LIMIT: equivalent to "--limit"
    """

    with open(config['database_filename'], "r") as fh:
        all_songs = json.load(fh)
        click.echo(f"Found {len(all_songs)} songs from JSON database file '{config['database_filename']}'")

    pending_songs = list()

    for song in all_songs:
        if len(_filters) < 1:
            add_to_pending = True
        else:
            # Every (key, value) filter must match exactly for the song to be selected
            add_to_pending = all(song[_filter[0]] == _filter[1] for _filter in _filters)

        if add_to_pending:
            pending_songs.append(song)

    if _limit is not None:
        pending_songs = pending_songs[0:_limit]

    click.echo(f"Downloading {len(pending_songs)} song files...")

    # Download one song at a time to avoid loading the website
    for song in pending_songs:
        downloadSong(config['download_directory'], _file_structure, song)


@click.group(context_settings=CONTEXT_SETTINGS)
@click.option(
    '-u', '--base-url', '_base_url', envvar='C3DBDL_BASE_URL',
    default='https://db.c3universe.com/songs/all', show_default=True,
    help='Base URL of the online C3DB songs page'
)
@click.option(
    '-d', '--download-directory', '_download_directory', envvar='C3DBDL_DOWNLOAD_DIRECTORY',
    default='~/Downloads', show_default=True,
    help='Download directory for JSON database and songs'
)
@click.option(
    '-j', '--json-database', '_json_database', envvar='C3DBDL_JSON_DATABASE',
    default='c3db.json', show_default=True,
    help='JSON database filename within download directory'
)
def cli(_base_url, _download_directory, _json_database):
    """
    C3DB Download Tool

    The C3DB Download Tool allows for easy scraping to a local JSON database and downloading
    of files from the C3 (Customs Creators Collective) database, a collection of custom songs
    for Guitar Hero, Rock Band, and similar clone games.

    This tool exists because the C3DB is very hard to mass download from: each song must
    be found in the extensive list, selected manually, and a second link clicked through,
    before a random file name is obtained. This tool simplifies the process by first collecting
    information about all available songs of a particular type, and then is able to download
    songs based on customizable filters (e.g. by genre, artist, author, etc.) and output them
    in a standardized format.

    To use the tool, first use the "database" command to build or modify your local JSON
    database, then use the "download" command to download songs.

    To avoid overloading or abusing the C3DB website, this tool operates exclusively in
    sequential mode by design; at most one page is scraped (for "database build") or song
    downloaded (for "download") at once. Additionally, the tool design ensures that the JSON
    database of songs is stored locally, so it only needs to be built once and then is reused
    to perform actual downloads without putting further load on the website.

    \b
    The following environment variables can be used for scripting purposes:
      * C3DBDL_BASE_URL: equivalent to "--base-url"
      * C3DBDL_DOWNLOAD_DIRECTORY: equivalent to "--download-directory"
      * C3DBDL_JSON_DATABASE: equivalent to "--json-database"

    """

    global config

    # Expand any ~ in the download directory pathname
    _download_directory = os.path.expanduser(_download_directory)

    # Populate the configuration store
    config['base_songs_url'] = _base_url
    config['download_directory'] = _download_directory
    config['database_filename'] = f"{_download_directory}/{_json_database}"


config = dict()

database.add_command(build_database)
database.add_command(edit_database)

cli.add_command(database)
cli.add_command(download)


def main():
    return cli(obj={})


if __name__ == '__main__':
    main()
@ -0,0 +1,4 @@
Click
requests
colorama
beautifulsoup4