1 Commits

Author SHA1 Message Date
50db6daf55 Add support for filtering on instrument parts 2023-04-07 14:11:24 -04:00
2 changed files with 142 additions and 224 deletions

View File

@ -105,30 +105,29 @@ guaranteed to be smaller than the total number listed on the C3DB website.
## Searching & Downloading
Once a database has been built, you can start searching for and downloading songs.
Once a database has been built, you can start searching for downloading songs.
To search for songs, use the `search` command. This command takes `--filter` arguments in order to show what
song(s) would be downloaded by a given filter, along with their basic information, without actually triggering
a download. Once you have a valid filter from a search, you can use it to `download` precisely the song(s) you
want.
song(s) would be downloaded by a given filter, without actually triggering the download. Once you have a valid
filter from a search, you can use it to `download` precisely the song(s) you want.
See the following sections for more details on the specifics of the filters and output formatting of the
`search` and `download` commands.
To download songs, use the `download` command. See the following sections for more details on the specifics of
the filters and output formatting of the `download` command.
By default, when downloading a given song, all possible download links (`dl_links`) will be downloaded; this
can be limited by using the `-i`/`--download-id` and `-d`/`--download-descr` options to pick and choose specific
files. For example, specify `--download-descr 360` to download only Xbox 360 RBCONs.
files.
Once a song has been downloaded, assuming that the file structure doesn't change, subsequent `download` runs will
not overwrite it and will simply skip downloading the file.
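The skip-if-present behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `needs_download` and the `song.rbcon` filename are hypothetical names chosen for the example.

```python
import os
import tempfile


def needs_download(download_filename):
    """Return True if the target file is absent and should be fetched."""
    # Hypothetical helper: an existing file is never overwritten.
    return not os.path.exists(download_filename)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "song.rbcon")
    first_run = needs_download(target)   # nothing on disk yet
    with open(target, "wb") as fh:
        fh.write(b"placeholder")         # stand-in for a real download
    second_run = needs_download(target)  # file now exists, so skip

print(first_run, second_run)
```

Because the check is keyed on the output path, changing the file structure between runs would cause files to be downloaded again.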
### Filtering
Filtering out the songs in the database is a key part of this tool. You might want to be able to grab only songs
with certain genres, artists, instruments, etc. or by certain authors, to make your custom song packs.
Filtering out the songs in the database is a key part of this tool. You might want to be able to grab only select
genres, artists, authors, etc. to make your custom song packs, or songs with particular instruments.
If multiple filters are specified, they are treated as a logical AND, i.e. *all* of the given filters must apply
to a song for it to be matched.
If multiple filters, of either type, are specified, they are treated as a logical AND, i.e. all the listed
filters, both information and instrument, must apply to a given song for it to be matched.
Filtering is always done during the search/download stage; the JSON database will always contain all possible
entries from the C3DB.
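The logical AND described above can be sketched as below. This is an illustrative sketch only: `matches_all` is a hypothetical helper, and it assumes case-insensitive substring matching on plain string fields, which may differ from the matching rules of a given version of the tool.

```python
def matches_all(song, filters):
    """Return True only if every (field, value) filter matches the song."""
    # Assumption: a filter matches when its value is a case-insensitive
    # substring of the corresponding song field.
    return all(
        value.lower() in str(song[field]).lower()
        for field, value in filters
    )


# Sample entry; the keys mirror the filter fields but the dict is illustrative.
song = {"artist": "Rush", "title": "Sweet Miracle", "author": "ejthedj"}

both_match = matches_all(song, [("artist", "rush"), ("title", "miracle")])
one_fails = matches_all(song, [("artist", "rush"), ("author", "someone")])
print(both_match, one_fails)
```

Note that an empty filter list matches every song, since `all()` of an empty sequence is `True`.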
@ -161,8 +160,8 @@ Found 1 matching songs:
> Song: "Rush - Sweet Miracle" from "Vapor Trails [Remixed] (2002)" by ejthedj
Instruments: guitar [2], bass [3], drums [4], vocals [4], keys [None]
Available downloads:
* Rock Band 3 Xbox 360
Available downloads:
* Rock Band 3 Xbox 360
```
In this case, one song matched; applying the same filter to a `download` would thus download only the single song.
@ -178,34 +177,24 @@ instruments that can be filtered on:
* `vocals`
* `keys`
To use instrument filters, append one or more `--filter instrument <instrument>` options to your `c3dbdl search` or
`download` command. An instrument filter option begins with the literal `--filter instrument`, followed by the
instrument you wish to filter on.
To use instrument filters, append one or more `--instrument-filter` options to your `c3dbdl search` or `download`
command. An instrument filter option begins with the literal `--instrument-filter`, followed by the instrument
you wish to filter on.
If a part contains the instrument at any difficulty (from 0-6), it will match the filter; if the instrument part
is missing, it will not match.
You can also invert the match by adding `no-` to the instrument name. So `--filter instrument no-keys` would
You can also invert the match by adding `no-` to the instrument name. So `--instrument-filter no-keys` would
only match songs *without* a keys part.
For example, to find all songs by Rush which have a keys part but no vocal part:
For example, to find all songs by Rush which have a keys part:
```
c3dbdl search --filter artist Rush --filter instrument keys --filter instrument no-vocals
Found 19562 songs from JSON database file 'Downloads/c3db.json'
Found 1 matching songs:
> Song: "Rush - La Villa Strangiato" from "Hemispheres (1978)" by DoNotPassGo
Instruments: guitar [6], bass [5], drums [6], vocals [None], keys [1]
Available downloads:
* Rock Band 3 Xbox 360
* Rock Band 3 Wii
* Rock Band 3 PS3
* Phase Shift
* Rock Band 3 Xbox 360 (Alternate Version)
```
In this case, one song matched; applying the same filter to a `download` would thus download only the single song.
In this case, X songs matched (only one shown for simplicity); applying the same filter to a `download` would
thus download all X songs.
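As a rough sketch of the instrument matching rules described in this section (presence at any difficulty matches, a missing part does not, and a `no-` prefix inverts the test), assuming each song stores its instruments as a mapping from name to difficulty or `None`. The helper name `instrument_matches` is hypothetical.

```python
def instrument_matches(song, instrument_filter):
    """Check one instrument filter, honouring an optional "no-" prefix."""
    negate = instrument_filter.startswith("no-")
    name = instrument_filter[3:] if negate else instrument_filter
    # A part is present at any difficulty, including 0; None means missing.
    has_part = song["instruments"][name] is not None
    return not has_part if negate else has_part


# Values taken from the "La Villa Strangiato" example output above.
song = {
    "instruments": {
        "guitar": 6, "bass": 5, "drums": 6, "vocals": None, "keys": 1,
    }
}

wanted = ["keys", "no-vocals"]
selected = all(instrument_matches(song, f) for f in wanted)
print(selected)
```

Comparing against `None` rather than testing truthiness matters here, because a difficulty of 0 is a valid, present part.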
### Output Format

View File

@ -153,6 +153,8 @@ def fetchSongData(entries):
for link_entry in download_links:
link = link_entry.get("href")
description = link_entry.get_text().strip()
if "c3universe.com" not in link:
continue
messages.append(f"Found download link: {link} ({description})")
dl_links.append(
{
@ -250,70 +252,6 @@ def buildDatabase(pages, concurrency):
return found_songs
def downloadFile(download_url, download_path, download_filename):
    attempts = 1
    p = None
    try:
        with requests.get(download_url, stream=True) as r:
            while attempts <= 3:
                try:
                    r.raise_for_status()
                    break
                except Exception:
                    click.echo(
                        f"Download attempt failed: HTTP {r.status_code}; retrying {attempts}/3"
                    )
                    sleep(attempts)
                    attempts += 1
            if r is None or r.status_code != 200:
                if r:
                    code = r.status_code
                else:
                    code = "-1"
                raise HTTPError(download_url, code, "", None, None)
            if not os.path.exists(download_path):
                os.makedirs(download_path)
            with open(download_filename, "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
            click.echo(f"Successfully downloaded to {download_filename}")
    except Exception as e:
        click.echo(f"Download attempt failed: {e}")
        return None
def parseC3Universe(dl_link):
    try:
        p = requests.get(dl_link)
        parsed_html = BeautifulSoup(p.text, "html.parser")
        download_element = (
            parsed_html.body.find("div", attrs={"class": "lock-head"})
            .find("a")
        )
        download_url = download_element.get("href")
        return download_url
    except Exception as e:
        click.echo(f"Failed parsing or retrieving file URL from link {dl_link}: {e}")
        return None
def parseMediafire(dl_link):
    try:
        p = requests.get(dl_link)
        parsed_html = BeautifulSoup(p.text, "html.parser")
        download_element = parsed_html.find(
            "a", attrs={"aria-label": "Download file"}
        )
        if download_element is not None:
            download_url = download_element.get("href")
        else:
            download_url = re.search(r"'(http[s]*://download[0-9]+.mediafire.com/.*)';", p.text).group(1)
        return download_url
    except Exception as e:
        click.echo(f"Failed parsing or retrieving file URL from link {dl_link}: {e}")
        return None
def downloadSong(destination, filename, entry, dlid, dldesc):
click.echo(
f"""> Downloading song "{entry['artist']} - {entry['title']}" by {entry['author']}..."""
@ -340,17 +278,19 @@ def downloadSong(destination, filename, entry, dlid, dldesc):
return
for dl_link in dl_links:
if 'dl.c3universe.com' in dl_link['link']:
download_url = parseC3Universe(dl_link["link"])
elif 'www.mediafire.com' in dl_link["link"]:
download_url = parseMediafire(dl_link["link"])
else:
click.echo("Download URL is not valid for CLI download; skipping...")
click.echo(f"URL: {dl_link['link']}")
continue
try:
p = requests.get(dl_link["link"])
if p.status_code != 200:
raise HTTPError(dl_link["link"], p.status_code, "", None, None)
if download_url is None:
click.echo("No valid download URL found, skipping...")
parsed_html = BeautifulSoup(p.text, "html.parser")
download_url = (
parsed_html.body.find("div", attrs={"class": "lock-head"})
.find("a")
.get("href")
)
except Exception as e:
click.echo(f"Failed parsing or retrieving HTML link: {e}")
continue
download_filename = filename.format(
@ -372,7 +312,38 @@ def downloadSong(destination, filename, entry, dlid, dldesc):
click.echo(f"File exists at {download_filename}")
continue
downloadFile(download_url, download_path, download_filename)
        attempts = 1
        p = None
        try:
            with requests.get(download_url, stream=True) as r:
                while attempts <= 3:
                    try:
                        r.raise_for_status()
                        break
                    except Exception:
                        click.echo(
                            f"Download attempt failed: HTTP {r.status_code}; retrying {attempts}/3"
                        )
                        sleep(attempts)
                        attempts += 1
                if r is None or r.status_code != 200:
                    if r:
                        code = r.status_code
                    else:
                        code = "-1"
                    raise HTTPError(download_url, code, "", None, None)
                if not os.path.exists(download_path):
                    os.makedirs(download_path)
                with open(download_filename, "wb") as f:
                    for chunk in r.iter_content(chunk_size=8192):
                        f.write(chunk)
                click.echo(f"Successfully downloaded to {download_filename}")
        except Exception as e:
            click.echo(f"Download attempt failed: {e}")
            continue
@click.command(name="build", short_help="Build the local database.")
@click.option(
@ -522,10 +493,21 @@ def database():
"-f",
"--filter",
"_filters",
envvar="C3DBDL_DL_FILTERS",
default=[],
multiple=True,
nargs=2,
help="Add a search filter.",
help="Add a filter option.",
)
@click.option(
"-s",
"--instrument-filter",
"_instrument_filters",
envvar="C3DBDL_DL_INSTFILTERS",
default=[],
multiple=True,
nargs=1,
help="Add an instrument filter."
)
@click.option(
"-l",
@ -540,7 +522,6 @@ def database():
"-i",
"--download-id",
"_id",
envvar="C3DBDL_DL_ID",
default=None,
type=int,
help='Download only "dl_links" entry N (1 is first, etc.), or all if unspecified.',
@ -549,20 +530,20 @@ def database():
"-d",
"--download-descr",
"_desc",
envvar="C3DBDL_DL_DESCR",
default=None,
help='Download only "dl_links" entries with this in their description (fuzzy).',
)
def download(_filters, _id, _desc, _limit, _file_structure):
def download(_filters, _instrument_filters, _id, _desc, _limit, _file_structure):
"""
Download song(s) from the C3DB webpage.
Filters allow granular selection of the song(s) to download. Multiple filters can be
specified, and a song is selected only if ALL filters match (logical AND). Filters are
specified in the form "--filter <field> <value>".
specified in the form "--filter <database_key> <value>" for information filters, or
"--instrument-filter [no-]<instrument>" for instrument filters.
For a full list of and explanation for filters, see the help output for the "search"
command ("c3dbdl search --help").
For a full list of and explanation for filters, see the "search" command help
(command "c3dbdl search --help").
In addition to filters, each song may have more than one download link, to provide
multiple versions of the same song (for example, normal and multitracks, or alternate
@ -588,9 +569,8 @@ def download(_filters, _id, _desc, _limit, _file_structure):
\b
The following environment variables can be used for scripting purposes:
* C3DBDL_DL_FILE_STRUCTURE: equivalent to "--file-structure"
* C3DBDL_DL_FILTERS: equivalent to "--filter"; limited to one instance
* C3DBDL_DL_LIMIT: equivalent to "--limit"
* C3DBDL_DL_ID: equivalent to "--download-id"
* C3DBDL_DL_DESCR: equivalent to "--download-descr"
"""
with open(config["database_filename"], "r") as fh:
@ -602,44 +582,16 @@ def download(_filters, _id, _desc, _limit, _file_structure):
pending_songs = list()
for song in all_songs:
add_to_pending = True
song_filters = _filters
song_information_filters = list()
song_instrument_filters = list()
if len(_filters) > 0:
# Extract the instrument filters
for _filter in song_filters:
if _filter[0] == "instrument":
song_instrument_filters.append(_filter[1].lower())
else:
song_information_filters.append(_filter)
if len(song_information_filters) > 0 or len(song_instrument_filters) > 0:
if len(_filters) < 1 and len(_instrument_filters) < 1:
add_to_pending = True
else:
# Parse the information filters
if len(song_information_filters) > 0:
if len(_filters) > 0:
try:
pending_information_filters = list()
for information_filter in song_information_filters:
filter_field = information_filter[0].lower()
filter_value = information_filter[1].lower()
if re.match("^!", filter_value):
filter_value = filter_value.replace("!", "")
if filter_value in song[filter_field].lower():
pending_information_filters.append(False)
else:
pending_information_filters.append(True)
elif re.match("^~", filter_value):
filter_value = filter_value.replace("~", "")
if filter_value in song[filter_field].lower():
pending_information_filters.append(True)
else:
pending_information_filters.append(False)
else:
if filter_value == song[filter_field].lower():
pending_information_filters.append(True)
else:
pending_information_filters.append(False)
pending_information_filters = [
_filter[1].lower() in song[_filter[0]].lower()
for _filter in _filters
]
information_add_to_pending = all(pending_information_filters)
except KeyError as e:
click.echo(f"Invalid filter field {e}")
@ -648,12 +600,12 @@ def download(_filters, _id, _desc, _limit, _file_structure):
information_add_to_pending = True
# Parse the instrument filters
if len(song_instrument_filters) > 0:
if len(_instrument_filters) > 0:
try:
pending_instrument_filters = list()
for instrument_filter in song_instrument_filters:
for instrument_filter in _instrument_filters:
if re.match("^no-", instrument_filter):
instrument_filter = instrument_filter.replace("no-", "")
instrument_filter = instrument_filter.replace('no-', '')
if song["instruments"][instrument_filter] is None:
pending_instrument_filters.append(True)
else:
@ -670,9 +622,7 @@ def download(_filters, _id, _desc, _limit, _file_structure):
else:
instrument_add_to_pending = True
add_to_pending = all(
[information_add_to_pending, instrument_add_to_pending]
)
add_to_pending = all([information_add_to_pending, instrument_add_to_pending])
if add_to_pending:
pending_songs.append(song)
@ -691,28 +641,45 @@ def download(_filters, _id, _desc, _limit, _file_structure):
"-f",
"--filter",
"_filters",
envvar="C3DBDL_DL_FILTERS",
default=[],
multiple=True,
nargs=2,
help="Add a search filter.",
help="Add an information filter.",
)
def search(_filters):
@click.option(
"-s",
"--instrument-filter",
"_instrument_filters",
envvar="C3DBDL_DL_INSTFILTERS",
default=[],
multiple=True,
nargs=1,
help="Add an instrument filter."
)
def search(_filters, _instrument_filters):
"""
Search for song(s) from the C3DB local database.
Filters allow granular selection of the song(s) to download. Multiple filters can be
specified, and a song is selected only if ALL filters match (logical AND). Filters are
specified in the form "--filter <field> <value>".
specified in the form "--filter <database_key> <value>" for information filters, or
"--instrument-filter [no-]<instrument>" for instrument filters.
Information filters match against the basic information of a song, for example finding
songs by a given artist, from a given album, by a given chart author, etc.
Filter values are fuzzy and case insensitive, so for example "word" would match
against a song titled "In The Word".
\b
The valid fields for the "<field>" value are:
The valid fields for the "<database_key>" value are:
* genre: The genre of the song.
* artist: The artist of the song.
* album: The album of the song.
* title: The title of the song.
* year: The year of the album/song.
* author: The author of the file on C3DB.
* instrument: An instrument chart for the song.
\b
For example, to download all songs in the genre "Rock":
@ -722,44 +689,34 @@ def search(_filters):
Or to download all songs by the artist "Rush" and the author "MyName":
--filter artist Rush --filter author MyName
Filter values are case insensitive, and non-instrument filters can be made fuzzy by
adding a tilde ("~") to the beginning of the "<value>".
\b
For example, to match all songs with "Word" in their titles:
--filter title ~word
A filter can be negated by adding an exclamation mark ("!") to the beginning of the
"<value>". Note that "!" must be escaped or single-quoted under BASH.
\b
For example, to match all songs except those by Yes as their artist:
--filter artist '!Yes'
Instrument filters allow selection of the presence of instruments. If an instrument
filter is given, only songs which contain parts for the given instrument(s) will be
shown.
\b
The valid instruments are:
The valid instruments (case-sensitive, i.e. must be lowercase) are:
* guitar
* bass
* drums
* vocals
* keys
To negate an instrument filter and find only entries without the specified
instrument, prepend "no-" to the instrument name.
To negate an instrument filter, prepend "no-" to the instrument name.
\b
For example, to download only songs that have a keys part but no vocal part:
--filter instrument keys --filter instrument no-vocals
--instrument-filter keys --instrument-filter no-vocals
Note that while instrument difficulties are displayed in the output of this command,
they cannot be filtered on; this is up to the user to do manually. The purpose of
instrument filters is to ensure that songs contain or don't contain given parts, not
to granularly select the difficulty of said parts (that's for the players of the game
to do, not us).
\b
The following environment variables can be used for scripting purposes:
* C3DBDL_DL_FILTERS: equivalent to "--filter"; limited to one instance
* C3DBDL_DL_INSTFILTERS: equivalent to "--instrument-filter"; limited to one instance
"""
with open(config["database_filename"], "r") as fh:
@ -771,44 +728,16 @@ def search(_filters):
pending_songs = list()
for song in all_songs:
add_to_pending = True
song_filters = _filters
song_information_filters = list()
song_instrument_filters = list()
if len(_filters) > 0:
# Extract the instrument filters
for _filter in song_filters:
if _filter[0] == "instrument":
song_instrument_filters.append(_filter[1].lower())
else:
song_information_filters.append(_filter)
if len(song_information_filters) > 0 or len(song_instrument_filters) > 0:
if len(_filters) < 1 and len(_instrument_filters) < 1:
add_to_pending = True
else:
# Parse the information filters
if len(song_information_filters) > 0:
if len(_filters) > 0:
try:
pending_information_filters = list()
for information_filter in song_information_filters:
filter_field = information_filter[0].lower()
filter_value = information_filter[1].lower()
if re.match("^!", filter_value):
filter_value = filter_value.replace("!", "")
if filter_value in song[filter_field].lower():
pending_information_filters.append(False)
else:
pending_information_filters.append(True)
elif re.match("^~", filter_value):
filter_value = filter_value.replace("~", "")
if filter_value in song[filter_field].lower():
pending_information_filters.append(True)
else:
pending_information_filters.append(False)
else:
if filter_value == song[filter_field].lower():
pending_information_filters.append(True)
else:
pending_information_filters.append(False)
pending_information_filters = [
_filter[1].lower() in song[_filter[0]].lower()
for _filter in _filters
]
information_add_to_pending = all(pending_information_filters)
except KeyError as e:
click.echo(f"Invalid filter field {e}")
@ -817,12 +746,12 @@ def search(_filters):
information_add_to_pending = True
# Parse the instrument filters
if len(song_instrument_filters) > 0:
if len(_instrument_filters) > 0:
try:
pending_instrument_filters = list()
for instrument_filter in song_instrument_filters:
for instrument_filter in _instrument_filters:
if re.match("^no-", instrument_filter):
instrument_filter = instrument_filter.replace("no-", "")
instrument_filter = instrument_filter.replace('no-', '')
if song["instruments"][instrument_filter] is None:
pending_instrument_filters.append(True)
else:
@ -839,9 +768,7 @@ def search(_filters):
else:
instrument_add_to_pending = True
add_to_pending = all(
[information_add_to_pending, instrument_add_to_pending]
)
add_to_pending = all([information_add_to_pending, instrument_add_to_pending])
if add_to_pending:
pending_songs.append(song)
@ -850,7 +777,7 @@ def search(_filters):
click.echo()
for entry in pending_songs:
click.echo(
f"""> Song: "{entry['artist']} - {entry['title']}" ({entry['length']}, {entry['genre']}) from "{entry['album']} ({entry['year']})" by {entry['author']}"""
f"""> Song: "{entry['artist']} - {entry['title']}" from "{entry['album']} ({entry['year']})" by {entry['author']}"""
)
instrument_list = list()
@ -859,8 +786,10 @@ def search(_filters):
click.echo(
f""" Instruments: {', '.join(instrument_list)}""",
)
click.echo(""" Available downloads:""")
click.echo(
f""" Available downloads:"""
)
for link in entry["dl_links"]:
click.echo(f""" * {link['description']}""")
click.echo()