DEVELOPERS.md: 43 additions, 0 deletions
@@ -261,6 +261,49 @@ The new analysis pipeline uses a job queue based on Celery/RabbitMQ. RabbitMQ co
for Freesound async tasks other than analysis).

### Supported audio descriptors for search and sound metadata

By combining the output of one or several audio analyzers (see the sections below), Freesound has a way to make audio descriptors available for filtering search queries and as sound metadata fields through the API. In this way, an API user is able to make queries and filter by both textual sound properties (like tags) and audio descriptors. Also, in the search results returned through the API, a user can specify which descriptor values should be returned, much like any other standard sound metadata field (see the API docs for more info).
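
As an illustration of how this looks from an API client's perspective, here is a minimal sketch using Python's `requests`. The endpoint and the `query`/`filter`/`fields`/`token` parameters follow the public Freesound API, but the descriptor name used here (`ac_brightness`) is only an illustrative placeholder: the descriptors actually exposed depend on the current `settings.CONSOLIDATED_AUDIO_DESCRIPTORS` definition.

```python
import requests

API_TOKEN = "YOUR_API_KEY"  # placeholder; obtain a key from the Freesound API credentials page

# Search for "piano" sounds, filtering by a tag and by an audio descriptor range,
# and ask for descriptor values to be returned alongside standard metadata fields.
# "ac_brightness" is an illustrative descriptor name, not a guaranteed field.
params = {
    "query": "piano",
    "filter": "tag:chord ac_brightness:[10 TO 90]",
    "fields": "id,name,tags,ac_brightness",
    "token": API_TOKEN,
}
response = requests.get("https://freesound.org/apiv2/search/text/", params=params)
response.raise_for_status()
for result in response.json()["results"]:
    print(result["id"], result["name"], result.get("ac_brightness"))
```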

The way to specify which audio descriptors should be available as fields/filters is through the `settings.CONSOLIDATED_AUDIO_DESCRIPTORS` configuration parameter. There, a list of descriptors is defined together with the way to find their values based on the output of audio analyzers. When analyzing sounds with audio analyzers, the results of each analyzer will be saved to disk (either as a .json or a .yaml file). Also, a `SoundAnalysis` object will be created for every analyzer/sound pair (see the sections above). Analyzer output will only be saved to disk, and will not be loaded into the corresponding `SoundAnalysis` object. The `Sound` model has a `Sound.consolidate_analysis()` method which, when run, will create a new `SoundAnalysis` object of analyzer type `consolidated` (`settings.CONSOLIDATED_ANALYZER_NAME`), will collect all the relevant descriptor data (following the `settings.CONSOLIDATED_AUDIO_DESCRIPTORS` list) from each individual analyzer output file, and will load the collected data into the `analysis_data` field of the newly created `SoundAnalysis` object. This is how only the relevant audio descriptor data is loaded into the DB in the consolidated `SoundAnalysis` object. `Sound.consolidate_analysis()` is called every time an analyzer relevant to `settings.CONSOLIDATED_AUDIO_DESCRIPTORS` finishes an analysis task, so sounds are updated automatically. There is also a management command, `create_consolidated_sound_analysis_and_sim_vectors`, that helps create these objects in bulk.
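
As a rough sketch, the consolidation step could also be triggered manually for a single sound from the Django shell. Module paths (`sounds.models`) and the exact lookup field names below are assumptions based on the description above, not verified API:

```python
# Hypothetical interactive sketch; module paths and field names are assumed, not verified.
from django.conf import settings
from sounds.models import Sound, SoundAnalysis

sound = Sound.objects.get(id=1234)  # any sound whose analyzer output files exist on disk
sound.consolidate_analysis()        # gathers descriptor values from the analyzer output files

# The collected descriptor values should now be stored in the consolidated SoundAnalysis object.
consolidated = SoundAnalysis.objects.get(
    sound=sound, analyzer=settings.CONSOLIDATED_ANALYZER_NAME)
print(consolidated.analysis_data)
```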

For every descriptor defined in `settings.CONSOLIDATED_AUDIO_DESCRIPTORS` there are a number of options that can be set, including some value transformations and whether the descriptor should be indexed in the search engine. If no options are set, sensible defaults will be used. The current definition of `settings.CONSOLIDATED_AUDIO_DESCRIPTORS` should hopefully be self-explanatory.
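
Purely as a hypothetical illustration of the kind of information such a definition carries (the real schema, key names, analyzer names and descriptor paths are those defined in the actual settings file and will likely differ):

```python
# Hypothetical sketch only: key names, analyzer names and descriptor paths are illustrative.
CONSOLIDATED_AUDIO_DESCRIPTORS = [
    {
        "name": "ac_brightness",             # descriptor name exposed as a search filter/metadata field
        "analyzer": "example-ac-extractor",  # analyzer whose output file contains the value
        "path": "brightness",                # where to find the value inside the analyzer output
        "transform": lambda v: round(v, 2),  # optional value transformation
        "index": True,                       # index in the search engine so it can be used in filters
    },
    {
        "name": "mfcc_mean",
        "analyzer": "example-essentia-extractor",
        "path": "lowlevel.mfcc.mean",
        "index": False,                      # multi-dimensional descriptors are not indexed for filtering
    },
]
```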

When consolidated analyses are loaded into `SoundAnalysis` objects in the DB, adding sounds to the search engine will also include these descriptors, and therefore they will be available for filtering in search queries. Note that multi-dimensional descriptors will not be indexed (they are not useful for filtering), and descriptors marked with `index: False` will also be skipped.
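
A minimal sketch of that skipping logic, reusing the hypothetical entry format from the previous example (the real indexing code may be organised quite differently):

```python
def descriptors_to_index(consolidated_data, descriptor_definitions):
    """Return only the descriptor values that should be sent to the search engine."""
    indexable = {}
    for definition in descriptor_definitions:
        name = definition["name"]
        value = consolidated_data.get(name)
        if value is None:
            continue  # descriptor missing for this sound
        if isinstance(value, (list, tuple)):
            continue  # multi-dimensional descriptors are not useful for filtering
        if not definition.get("index", True):
            continue  # descriptors explicitly marked with index: False are skipped
        indexable[name] = value
    return indexable
```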

### Similarity search and similarity spaces

Similarly to how audio descriptors work, Freesound can also define a number of "similarity spaces" that can be used for similarity search and that are based on the output of audio analyzers. Similarity spaces are defined through `settings.SIMILARITY_SPACES`. The entries of `settings.SIMILARITY_SPACES` define a number of properties, such as which analyzer/property value the vector should be obtained from, whether the vector should be L2 normalized, etc. Checking the current definition of `settings.SIMILARITY_SPACES` should hopefully be self-explanatory.
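
Again, only as a hypothetical illustration (key names and analyzer names are made up; the `laion_clap` name and its 512 dimensions are taken from the API documentation change further below):

```python
# Hypothetical sketch only: key names and analyzer names are illustrative.
SIMILARITY_SPACES = [
    {
        "name": "laion_clap",                 # name of the similarity space
        "analyzer": "example-clap-analyzer",  # analyzer whose output provides the vector
        "path": "embedding",                  # property in the analyzer output that holds the vector
        "dimensions": 512,                    # expected vector size
        "l2_normalize": True,                 # whether vectors should be L2 normalized before use
    },
]
```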

The `Sound` model has a `Sound.load_similarity_vectors()` method which will create corresponding `SoundSimilarityVector` objects for each pair of sound and type of similarity space. Once the vectors are loaded in the DB, they can be indexed in the search engine and also be used as targets for a similarity search. `Sound.load_similarity_vectors()` is called when any analyzer relevant to `settings.SIMILARITY_SPACES` finishes analysing a sound, therefore vectors should be loaded automatically (and also indexed in the search engine, as sounds will also be marked as "index dirty" when new similarity vector objects are created). The management command `create_consolidated_sound_analysis_and_sim_vectors` can be used to help creating these objects in bulk.
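
As with consolidation, here is a rough sketch of loading the vectors for a single sound from the Django shell; module paths and field names are assumptions based on the description above:

```python
# Hypothetical interactive sketch; module paths and field names are assumed, not verified.
from sounds.models import Sound, SoundSimilarityVector

sound = Sound.objects.get(id=1234)
sound.load_similarity_vectors()  # creates SoundSimilarityVector objects from the analyzer output

# One object per (sound, similarity space) pair should now exist.
print(SoundSimilarityVector.objects.filter(sound=sound).count())
```

For bulk loading, the `create_consolidated_sound_analysis_and_sim_vectors` management command mentioned above (see also the README) covers the same ground for many sounds at once.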

README.md: 11 additions, 2 deletions
@@ -45,7 +45,7 @@ Below are instructions for setting up a local Freesound installation for develop
    freesound-data/previews/
    freesound-data/analysis/

-  4. Download the [Freesound development similarity index](https://drive.google.com/file/d/1ydJUUXbQZbHrva4UZd3C05wDcOXI7v1m/view?usp=sharing) and the [Freesound tag recommendation models](https://drive.google.com/file/d/1snaktMysCXdThWKkYuKWoGc_Hk2BElmz/view?usp=sharing) and place their contents under the `freesound-data/similarity_index/` and `freesound-data/tag_recommendation_models` directories respectively (you'll need to create the directories).
+  4. Download the [Freesound tag recommendation models](https://drive.google.com/file/d/1snaktMysCXdThWKkYuKWoGc_Hk2BElmz/view?usp=sharing) and place their contents under the `freesound-data/tag_recommendation_models` directory (you'll need to create that directory).

5. Rename the `freesound/local_settings.example.py` file, so you can customise Django settings if needed, and create a `.env` file with your local user UID and other useful settings. These other settings include `COMPOSE_PROJECT_NAME` and `LOCAL_PORT_PREFIX`, which can be used to allow parallel local installations running on the same machine (provided that these two variables are different in the local installations), and `FS_BIND_HOST`, which you should set to `0.0.0.0` if you need to access your local Freesound services from a remote machine.
@@ -115,7 +115,16 @@ If you are prompted for a password, use `localfreesoundpgpassword`, this is define

Because the `web` container mounts a named volume for the home folder of the user running the `shell_plus` process, command history should be kept between container runs :)

-  16. (extra step) The steps above will get Freesound running, but to save resources on your local machine some non-essential services will not be started by default. If you look at the `docker-compose.yml` file, you'll see that some services are marked with the profile `analyzers` or `all`. These services include sound similarity, search results clustering and the audio analyzers. To run these services you need to explicitly tell `docker compose` using the `--profile` flag (note that some services need additional configuration steps; see the *Freesound analysis pipeline* section in `DEVELOPERS.md`):
+  16. (extra) Load audio descriptors and similarity vectors into the database and reindex the search index. This is necessary to make audio descriptors available through the API and to make similarity search work. Note that for this to work, you need to have properly set up the development data folder, and you should see some files inside the `freesound-data/analysis` folder which store the (previously computed) results of the Freesound audio analyzers.
+
+       # First run the following command, which will create the relevant objects in the DB. Note that this can take a few minutes.
+       docker compose run --rm web python manage.py create_consolidated_sound_analysis_and_sim_vectors --force
+
+       # Then re-create the search engine sounds index after the audio descriptor data has been loaded into the DB. You need to specifically indicate that similarity vectors should be added.
+       docker compose run --rm web python manage.py reindex_search_engine_sounds --include-similarity-vectors
+
+    The steps above will get Freesound running, but to save resources on your local machine some non-essential services will not be started by default. If you look at the `docker-compose.yml` file, you'll see that some services are marked with the profile `analyzers` or `all`. These services include sound tag recommendation and the audio analyzers. To run these services you need to explicitly tell `docker compose` using the `--profile` flag (note that some services need additional configuration steps; see the *Freesound analysis pipeline* section in `DEVELOPERS.md`):

        docker compose --profile analyzers up # To run all basic services + sound analyzers
        docker compose --profile all up # To run all services

_docs/api/source/resources.rst: 4 additions, 0 deletions
@@ -183,6 +183,10 @@ laion_clap 512 This space is built using LAION-CL
freesound_classic 100 This space is built using a combination of low-level acoustic audio features extracted using the ``FreesoundExtractor`` from the Essentia audio analysis library (https://essentia.upf.edu). We currently don't provide code to extract these features from arbitrary audio, but we might do that in the future.

When using vectors as input for the ``similar_to`` parameter, make sure that the vectors are extracted using the same method as the one used to build the similarity space.
Note that L2-normalisation is automatically applied to input vectors. If the provided vector is already L2-normalised, this will have no effect.
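
For reference, L2-normalisation simply scales a vector to unit Euclidean length. Here is a minimal NumPy sketch (not an official client snippet) of what that amounts to:

```python
import numpy as np

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    v = np.asarray(vector, dtype=float)
    norm = np.linalg.norm(v)
    return v if norm == 0 else v / norm

vec = [0.3, -1.2, 0.5]
normalized = l2_normalize(vec)
print(normalized)                  # the direction is preserved, only the scale changes
print(np.linalg.norm(normalized))  # ~1.0; normalising again has no further effect
```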