I’m wondering whether I can use cGPT for a particular use case and, if so, how I can go about feeding training data to it.
What I am trying to accomplish: I want to be able to supply cGPT with a music file (.ogg or .mp3) and get the BPM of the song to an accuracy of .001 BPM. Huge bonus points if it can also print out the second (down to .001 sec) at which the BPM changes in a song.
This is not a cGPT application; it is a deep learning application. As for the question of whether a deep learning process can do this: absolutely.
Is there an open way to go about using deep learning? Is it something as accessible as cGPT?
No, nowhere near as accessible, but you can still learn it off the internet. Depends on how much effort you want to put into this project, really. The kind of thing you’re trying to do is pretty involved and will take a lot of trial and error, time, and effort to get working well. People have put in a lot of effort to make it easier but it’s not a trivial task.
If you’re really interested, I’d recommend looking into simple neural network tutorials on YouTube, specifically ones using TensorFlow or (if you have institutional access) MATLAB.
Can’t most DJ software do this?
DJ software is extremely inaccurate. It’s good for a rough estimate, but it can be wildly wrong at times.
The point where the BPM (beats per minute) changes from one value to another cannot be located with arbitrary precision. At 60 BPM there is only one beat per second, and you want 0.001 s resolution, which is one thousandth of a beat. A 1 kHz tone completes only one full cycle in that time.
It also depends on how long the samples are. A 0.2 second sample is hardly going to give a BPM at all.
Maybe you can get down to fractional delta-BPM values at a high initial BPM and with long samples. But that is it.
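To put rough numbers on the sample-length point, here is a minimal sketch. It assumes a constant tempo and that the BPM is taken from the time between the first and last beat in the sample. The function name and the 1 ms span error are hypothetical values chosen for illustration, not anything measured.

```python
import math


def bpm_uncertainty(bpm: float, span_seconds: float, span_error_seconds: float) -> float:
    """Rough BPM error when tempo is taken from the time between the
    first and last beat of a sample (constant tempo assumed)."""
    # How many beat intervals the sample covers (may be fractional).
    beat_intervals = bpm * span_seconds / 60.0
    if beat_intervals < 1.0:
        # Not even one full beat interval: no BPM can be measured this way.
        return math.inf
    # bpm = 60 * intervals / span, so a timing error dt on the span gives
    # d(bpm) ~= bpm * dt / span (first-order error propagation).
    return bpm * span_error_seconds / span_seconds


# Assumed 1 ms error on the measured span, at 120 BPM.
for span in (0.2, 10.0, 180.0):
    print(span, bpm_uncertainty(120.0, span, 0.001))
```

Under these assumptions a 0.2 s sample gives no estimate at all, a 10 s sample is good to about 0.012 BPM, and a 180 s sample to about 0.0007 BPM. The achievable precision scales with how long the tempo stays constant, which also limits how finely a tempo change can be placed in time.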
Then there is the actual big question: how is this even relevant? What difference does it make whether it is 60 or 60.001 BPM?
This application of deep learning would apply to music suitable for playing DDR/ITG/StepMania/StepManiaX/PIU etc.; essentially music gaming:
Most music that would be reasonably fun to play falls within 110-240 BPM and runs between 2.5 and 7 minutes long. A song coded at 110 BPM but with a true BPM of 110.001 will drift by roughly 2 ms over the course of the song. Music games are predicated on timing precision down to 15 ms as a minimum. I, myself, hit notes within a rough range of 6 ms at my best (and I’m barely top 100 in the world).
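For anyone who wants to check the drift arithmetic, here is a small sketch. It assumes a constant tempo throughout and a chart that is perfectly aligned at time zero. The function name is made up for illustration.

```python
def drift_ms(coded_bpm: float, true_bpm: float, song_seconds: float) -> float:
    """Offset in milliseconds, at the end of the song, between where the
    coded beat grid places the last beat and where it really occurs."""
    # Beats that have actually elapsed when the song ends.
    beats = true_bpm * song_seconds / 60.0
    # Time at which the coded grid expects that same beat.
    coded_time = beats * 60.0 / coded_bpm
    return (coded_time - song_seconds) * 1000.0


# Shortest and longest song lengths mentioned above: 2.5 and 7 minutes.
for seconds in (150.0, 420.0):
    print(seconds, round(drift_ms(110.0, 110.001, seconds), 2))
```

At 110 vs 110.001 BPM this gives about 1.4 ms of drift for a 2.5 minute song and about 3.8 ms for a 7 minute song, so the 2 ms figure corresponds to a song somewhere in the middle of that range.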
You can produce the audio with arbitrary temporal precision. The issue is that this precision is simply impossible to reconstruct given the low number of “virtual sample points per time” (that is, the ones relevant for the BPM). The same goes for the wavelength of the actual sound discussed above, which puts up yet another limit: just measuring the frequency becomes less and less accurate, or even impossible.
I don’t have much to contribute, but I’m curious what the end goal of this project is. Sounds interesting.