Genre-Clawer (GitHub) is a tool, available as both a Maven library and a CLI application, created to verify an idea: fetching genres for given music pieces. It started with a post on the Lower Hutt Countdown community wall and turned into a good journey as a weekend hackathon.
Background
Last week, I saw a post on the Lower Hutt Countdown community wall. The writer shared a problem: picking songs from a set of genres in a given order. I found it an interesting topic and divided it into two steps:
- Genre tagging;
- Recommendation algorithms.
Genre-Clawer came out of that weekend as a Maven library and a CLI binary. It was also a good journey for me to learn new stuff.
This JAR is a tool to support the tagging sub-problem, and a chance to practice Selenide as well as Maven library deployment. For a realistic case, I would prefer a fast implementation in Python for its more convenient ecosystem and sweeter language. However, since most of the dev/test positions in Wellington that I am looking at require a track record of Java coding, I chose to keep my hands warm with Java whenever possible.
The Java library picks the Billboard weekly Top 100 songs by default to fetch genres. From v1.3 on, the success rate is 92% for the Top 100 songs. For a song whose lookup fails, the genre list contains only one term, "NA".
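To make the "NA" convention concrete, here is a minimal sketch of how a success rate like the one above could be computed from fetched results. The class `GenreSuccessRate`, its methods, and the sample data are hypothetical names for illustration, not part of the Genre-Clawer API; the sketch only assumes that results arrive as a map from song title to genre list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Genre-Clawer: counts songs that got real genres. */
final class GenreSuccessRate {
    /** Placeholder term that marks a failed lookup. */
    static final String NOT_AVAILABLE = "NA";

    /** A lookup counts as failed when its genre list is exactly one "NA" term. */
    static boolean isFailed(List<String> genres) {
        return genres.size() == 1 && NOT_AVAILABLE.equals(genres.get(0));
    }

    /** Returns the share of songs (0.0 to 1.0) whose genre list is not the placeholder. */
    static double successRate(Map<String, List<String>> genresBySong) {
        if (genresBySong.isEmpty()) {
            return 0.0;
        }
        long succeeded = genresBySong.values().stream()
                .filter(genres -> !isFailed(genres))
                .count();
        return (double) succeeded / genresBySong.size();
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        Map<String, List<String>> sample = new LinkedHashMap<>();
        sample.put("Song A", List.of("Pop", "R&B"));
        sample.put("Song B", List.of("NA"));
        sample.put("Song C", List.of("Hip-Hop"));
        sample.put("Song D", List.of("Rock"));
        // 3 of the 4 sample songs have real genres.
        System.out.println(successRate(sample));
    }
}
```

Checking for the single "NA" term in one place keeps the failure convention out of the rest of the calling code.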
User Guide
Maven
<repositories> |
<dependency> |
With the above Maven dependencies in place, here is a quick sample to query the genre list with song information.
import me.maxwu.genre.htmlUnit.HtmlUnitBase; |
# Execution Logs |
Another sample uses the HtmlUnit client for both the Billboard Top 100 song list and genre fetching.
import me.maxwu.genre.htmlUnit.HtmlUnitBase; |
#Execution logs |
Since in most situations people will not use multiple clients except in testing, the app package is being updated to support a simplified application and better efficiency. These items are on the list for v1.4, the next release, which is open for recommendations.
Java Application CLI
Usage:
>java -jar ./target/Genre-Clawer-1.3.jar [-h] | [-n ${size}] [-c ${client}] [-s ${song} [-a ${artist}]]
"-h": Show help message and exit.
"-n": Size of the song list fetched from the Billboard Top 100. This option is an int and defaults to 10.
"-c": Client type, one of {"HtmlUnit", "Selenide", "Jsoup" (in progress)}. The option is a case-insensitive string of the client name. HtmlUnit is the default client type. Using Selenide requires a Chrome browser set up with Selenium and WebDriverManager.
"-s": Song name as a string, to query genres for a specific music piece. It is recommended to wrap the song name in quotation marks.
"-a": Artist name as a string, to support the above query by song name. It is recommended to wrap the artist name in quotation marks.
Example 1: Fetch genres for the Billboard top 5 songs
By default, the HtmlUnit headless browser is used in this example.
# To fetch billboard top5 songs genres: |
Example 2: Fetch genres for song name "Love On The Brain" and artist name "Rihanna"
Since no "-c" option is given, this example also uses the default HtmlUnit headless browser.
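The options described in this section can be read with a small hand-rolled parser. The sketch below is only an illustration of the documented defaults (size 10, HtmlUnit client, case-insensitive client name); `CliOptionsSketch` and its fields are hypothetical and are not the parser Genre-Clawer actually ships.

```java
import java.util.Locale;

/** Hypothetical holder for the documented CLI options; not the tool's real parser. */
final class CliOptionsSketch {
    boolean help = false;
    int size = 10;               // "-n" defaults to 10
    String client = "htmlunit";  // "-c" defaults to HtmlUnit, stored lower-cased
    String song = null;          // "-s", optional
    String artist = null;        // "-a", optional

    static CliOptionsSketch parse(String[] args) {
        CliOptionsSketch opts = new CliOptionsSketch();
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            if ("-h".equals(flag)) {
                opts.help = true;
                continue;
            }
            // Every remaining option takes exactly one value.
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = args[++i];
            switch (flag) {
                case "-n": opts.size = Integer.parseInt(value); break;
                case "-c": opts.client = value.toLowerCase(Locale.ROOT); break;
                case "-s": opts.song = value; break;
                case "-a": opts.artist = value; break;
                default: throw new IllegalArgumentException("Unknown option " + flag);
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        // Same arguments as Example 2; the shell has already removed the quotation marks.
        String[] sample = {"-s", "Love On The Brain", "-a", "Rihanna"};
        CliOptionsSketch opts = parse(sample);
        System.out.println(opts.client + " / " + opts.size + " / " + opts.song + " / " + opts.artist);
    }
}
```

Lower-casing the client name once at parse time is one simple way to get the case-insensitive matching described for "-c".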
Further work
For the second sub-problem, some additional glue work is needed in between:
- Normalizing genres and mapping them to dimensions, e.g. with ID3v1;
- Cached persistence and realistic storage;
- A RESTful API with a presentation layer in front.
With the above in place, generating a playlist could be turned into a typical recommendation problem.
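As a rough idea of the genre-mapping step, the sketch below normalizes free-form genre strings and maps them to ID3v1 genre indices, falling back to "Other" (index 12) for anything unknown. `Id3v1GenreMapper` is a hypothetical name, the table covers only a handful of the ID3v1 genres, and the normalization rule (lower-case, "&" read as "n", punctuation and spaces dropped) is an assumption rather than something Genre-Clawer does today.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical mapper from free-form genre names to a few ID3v1 genre indices. */
final class Id3v1GenreMapper {
    /** ID3v1 index for "Other", used as the fallback. */
    static final int OTHER = 12;

    private static final Map<String, Integer> IDS = new HashMap<>();
    static {
        // Partial table; keys are already in normalized form.
        IDS.put("blues", 0);
        IDS.put("country", 2);
        IDS.put("hiphop", 7);
        IDS.put("jazz", 8);
        IDS.put("pop", 13);
        IDS.put("rnb", 14);
        IDS.put("rap", 15);
        IDS.put("reggae", 16);
        IDS.put("rock", 17);
    }

    /** Lower-cases, reads "&" as "n", and drops everything except letters and digits. */
    static String normalize(String genre) {
        return genre.toLowerCase(Locale.ROOT)
                .replace("&", "n")
                .replaceAll("[^a-z0-9]", "");
    }

    /** Returns the ID3v1 index for a genre name, or OTHER when it is not in the table. */
    static int toId(String genre) {
        return IDS.getOrDefault(normalize(genre), OTHER);
    }

    public static void main(String[] args) {
        for (String genre : new String[] {"R&B", "Hip-Hop", "hip hop", "K-Pop"}) {
            System.out.println(genre + " -> " + toId(genre));
        }
    }
}
```

A fixed integer per genre gives the later recommendation step a stable dimension to work with, at the cost of collapsing finer-grained tags into "Other".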
Appendix
Change logs:
2017-03-21, Initial summary for Genre-Clawer v1.4a1.