
Audio Search Engine
Voice-powered search with NLP and speech recognition
What It Does
Core capabilities of the Audio Search Engine.
Speech Recognition
State-of-the-art speech models for accurate voice-to-text conversion, even in noisy environments.
Semantic Search
Goes beyond keywords to understand the context and intent behind every voice query.
Scalable Indexing
Fast and memory-efficient indexing architecture for searching through millions of audio files.
Audio Fingerprinting
Identify audio clips through unique acoustic signatures with minimal data requirements.
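
How an acoustic signature can be built is easiest to see in code. The sketch below assumes a common peak-pairing scheme: strong spectrogram peaks are paired with a few of their successors, and each pair is hashed from its two frequency bins and the time gap between them. The function `hash_peak_pairs`, its parameters `fan_out` and `max_dt`, and the sample peaks are all hypothetical. The internals of the project's own `analyzer.fingerprint` are not shown on this page and may differ.

```python
import hashlib
from typing import Iterator, List, Tuple

Peak = Tuple[int, int]  # (time_frame, frequency_bin) of a spectrogram peak


def hash_peak_pairs(
    peaks: List[Peak], fan_out: int = 3, max_dt: int = 200
) -> Iterator[Tuple[str, int]]:
    """Yield (hash, offset) pairs built from neighbouring spectrogram peaks.

    Hypothetical helper for illustration; not the project's analyzer.
    """
    peaks = sorted(peaks)  # order by time frame, then frequency bin
    for i, (t1, f1) in enumerate(peaks):
        # Pair the anchor peak with up to `fan_out` peaks that follow it.
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            dt = t2 - t1
            if 0 <= dt <= max_dt:
                # The hash depends only on the two frequencies and the time
                # gap, so it is independent of where the clip starts.
                key = f"{f1}|{f2}|{dt}".encode()
                yield hashlib.sha1(key).hexdigest()[:20], t1


# Made-up peaks, plus the same peaks shifted 100 frames later in time.
peaks = [(0, 40), (3, 55), (7, 40), (12, 90)]
shifted = [(t + 100, f) for t, f in peaks]

fingerprints = list(hash_peak_pairs(peaks))
shifted_fps = list(hash_peak_pairs(shifted))
```

Because the absolute time is kept only as the offset and never enters the hash, a short clip recorded from the middle of a track produces the same hashes as the full track, just with shifted offsets.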
Project Source Code
Browse the core Python modules that power the audio search engine.
    import os
    import src.analyzer as analyzer
    from src.filereader import FileReader
    from termcolor import colored
    from src.db import SQLiteDatabase

    MUSICS_FOLDER_PATH = "mp3"

    if __name__ == '__main__':
        db = SQLiteDatabase()

        for filename in os.listdir(MUSICS_FOLDER_PATH):
            # Skip hidden files and non-WAV files
            if not filename.endswith(".wav") or filename.startswith('.'):
                continue

            try:
                file_path = os.path.join(MUSICS_FOLDER_PATH, filename)
                reader = FileReader(file_path)
                audio = reader.parse_audio()
            except Exception as e:
                print(colored(f"Error processing {filename}: {str(e)}", "red"))
                continue

            song = db.get_song_by_filehash(audio['file_hash'])

            if not song:
                song_id = db.add_song(filename, audio['file_hash'])
            else:
                song_id = song['id']

            print(colored(f"Analyzing music: {filename}", "green"))

            hash_count = db.get_song_hashes_count(song_id)
            if hash_count > 0:
                msg = f'Warning: This song already exists ({hash_count} hashes), skipping'
                print(colored(msg, 'yellow'))
                continue

            hashes = set()

            for channeln, channel in enumerate(audio['channels']):
                channel_hashes = analyzer.fingerprint(channel, Fs=audio['Fs'])
                channel_hashes = set(channel_hashes)
                msg = f'Channel {channeln} saved {len(channel_hashes)} hashes'
                print(colored(msg, attrs=['dark']))
                hashes |= channel_hashes

            values = [(song_id, hash, offset) for hash, offset in hashes]
            db.store_fingerprints(values)

        print(colored('Done', "green"))

Live Terminal Output
See the audio fingerprinting and matching in action.
Get the Code
Clone the repository and start searching from your terminal.
Audio-Search-Engine
A Python-based audio fingerprinting and recognition system. It analyzes WAV files, generates acoustic fingerprints, stores them in SQLite, and matches recorded audio against the database in real time.
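
The store-and-match flow described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions, not the repository's implementation. The table layout, the column name `song_offset`, and the functions `create_db`, `store_fingerprints`, and `best_match` are hypothetical, and the hashes are short made-up strings. Matching here uses offset voting, a common approach: each hash shared by the sample and a stored song votes for the difference between their offsets, and a true match piles its votes onto a single difference.

```python
import sqlite3
from collections import Counter
from typing import Iterable, Optional, Tuple


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed fingerprint schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fingerprints (song_id INTEGER, hash TEXT, song_offset INTEGER)"
    )
    # An index on the hash column keeps lookups fast as the table grows.
    conn.execute("CREATE INDEX idx_fingerprints_hash ON fingerprints (hash)")
    return conn


def store_fingerprints(
    conn: sqlite3.Connection, values: Iterable[Tuple[int, str, int]]
) -> None:
    """Insert (song_id, hash, offset) rows in one batch."""
    conn.executemany(
        "INSERT INTO fingerprints (song_id, hash, song_offset) VALUES (?, ?, ?)",
        values,
    )
    conn.commit()


def best_match(
    conn: sqlite3.Connection, sample: Iterable[Tuple[str, int]]
) -> Optional[Tuple[int, int]]:
    """Return (song_id, votes) for the best-aligned song, or None if no hash hits."""
    votes: Counter = Counter()
    for sample_hash, sample_offset in sample:
        rows = conn.execute(
            "SELECT song_id, song_offset FROM fingerprints WHERE hash = ?",
            (sample_hash,),
        )
        for song_id, song_offset in rows:
            # Hashes from the same song at a consistent alignment share one delta.
            votes[(song_id, song_offset - sample_offset)] += 1
    if not votes:
        return None
    (song_id, _delta), count = votes.most_common(1)[0]
    return song_id, count


conn = create_db()
store_fingerprints(
    conn,
    [(1, "aa", 10), (1, "bb", 14), (1, "cc", 21), (2, "aa", 5), (2, "dd", 9)],
)

# A recorded sample that starts 10 frames into song 1.
sample = [("aa", 0), ("bb", 4), ("cc", 11)]
result = best_match(conn, sample)
print(result)  # (1, 3): song 1 wins with three aligned hashes
```

Song 2 also contains the hash `"aa"`, but it collects only one vote. Counting aligned offsets rather than raw hash hits is what keeps an accidental hash collision from producing a false match.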