audioperm package
Submodules
audioperm.audioperm module
class audioperm.audioperm.AudioPerm(audio, sr=22050, **kwargs)
Bases: object
The main class for audioperm. Takes an audio file path (or a batch of file paths) or a numpy array (int16 or float). Audio is stored internally as 16-bit PCM, which differs from librosa's default floating-point representation.
permute(n_permutations=1, interm_silence=1000)
Get permutations of the words. TODO: Use yield.
- Parameters
n_permutations (int) – Maximum number of permutations to return.
interm_silence (int) – Intermediate silence between words (in ms).
- Returns
Union[list of list of ndarray, list of ndarray]
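The idea behind permute can be illustrated with a minimal sketch. This is not the library's implementation: `permute_words` is a hypothetical helper, plain Python lists stand in for ndarrays, and the ordering produced by `itertools.permutations` (which starts with the original word order) is an assumption about how permutations might be enumerated.

```python
import itertools


def permute_words(words, n_permutations=1, interm_silence=1000, sr=22050):
    """Join word segments in different orders, separated by silence.

    Hypothetical helper: `words` is a list of sample lists (stand-ins for
    ndarrays); `interm_silence` is in milliseconds, as in the documented API.
    """
    # Number of zero-valued samples that make up the requested silence.
    gap = [0] * (sr * interm_silence // 1000)
    results = []
    # permutations() is lazy; islice caps the number of orderings taken.
    orderings = itertools.islice(itertools.permutations(words), n_permutations)
    for order in orderings:
        out = []
        for i, word in enumerate(order):
            if i > 0:
                out.extend(gap)  # silence goes between words only
            out.extend(word)
        results.append(out)
    return results


words = [[1, 2], [3], [4, 5]]
for signal in permute_words(words, n_permutations=2, interm_silence=1, sr=2000):
    print(signal)
```

Because the orderings are generated lazily, the same loop could be turned into a generator, which is one way to read the "Use yield" TODO.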
word_segments(silence_thresh=-60.0, min_silence_len=5, return_words=True)
Segments the audio files into multiple segments or words. TODO: Improve word segmentation. Add label-wise segmentation (given n words as labels, find n appropriate words).
- Parameters
silence_thresh (float) – Silence threshold (in dBFS) for segmenting the audio. Same as pydub.
min_silence_len (int) – Minimum silence length (in ms). Same as pydub.
- Returns
Union[list of list of ndarray, list of ndarray]
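As a rough illustration of how silence_thresh and min_silence_len interact, the sketch below splits a signal wherever the amplitude stays under the threshold for long enough. It is a simplification and an assumption, not the library's or pydub's algorithm: `split_on_silence_sketch` is a hypothetical name, it compares individual samples rather than windowed RMS, and it assumes 16-bit samples with 32768 as full scale.

```python
def split_on_silence_sketch(sig, sr=22050, silence_thresh=-60.0, min_silence_len=5):
    """Split a list of int16-range samples at sufficiently long silences.

    Hypothetical helper: a sample is "silent" when its magnitude is below
    silence_thresh (dBFS); a silent run of at least min_silence_len ms
    ends the current segment.
    """
    full_scale = 32768.0  # assumption: 16-bit PCM full scale
    # Convert the dBFS threshold to a linear amplitude.
    amp_thresh = full_scale * 10 ** (silence_thresh / 20.0)
    min_run = max(1, sr * min_silence_len // 1000)

    segments, current, pending = [], [], []
    for s in sig:
        if abs(s) < amp_thresh:
            pending.append(s)  # collect silent samples until we know the run length
            continue
        if len(pending) >= min_run:
            # Long silence: close the current segment (if any) and start a new one.
            if current:
                segments.append(current)
            current = []
        elif current:
            # Short pause: keep it inside the current segment.
            current.extend(pending)
        pending = []
        current.append(s)
    if current:
        segments.append(current)
    return segments


sig = [0] * 10 + [1000, 2000] + [0] * 2 + [1500] + [0] * 8 + [3000, -3000] + [0] * 3
print(split_on_silence_sketch(sig, sr=1000))
```

With sr=1000 and the default min_silence_len of 5 ms, the two-sample pause stays inside the first segment while the eight-sample pause separates the two segments. Leading and trailing silence is dropped.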
audioperm.utils module
Helper functions for audioperm.
audioperm.utils.max_min_heuristics(sig, max_perc=0.2, min_perc=0.2)
Calculates the average maximum and average minimum from a percentage of the sorted amplitudes. For audio signals, a single peak or valley is not representative, so the average over the top perc percentage of the population is taken instead.
- Parameters
sig (ndarray) – A numpy array.
max_perc (float) – Population percentage for taking the max.
min_perc (float) – Population percentage for taking the min.
- Returns
tuple containing:
max_p (float) – population max for the positive signal
min_p (float) – population min for the positive signal
max_n (float) – population max for the negative signal
min_n (float) – population min for the negative signal
- Return type
tuple
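The averaging heuristic described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `avg_extremes` and `max_min_sketch` are hypothetical names, max_perc and min_perc are treated as fractions (0.2 meaning 20%), and "max" and "min" for the negative half are taken in plain numeric order.

```python
def avg_extremes(values, max_perc=0.2, min_perc=0.2):
    """Return (mean of the largest max_perc fraction, mean of the smallest min_perc fraction)."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Always average at least one sample so short inputs still work.
    n_max = max(1, int(len(ordered) * max_perc))
    n_min = max(1, int(len(ordered) * min_perc))
    top = ordered[-n_max:]
    bottom = ordered[:n_min]
    return sum(top) / len(top), sum(bottom) / len(bottom)


def max_min_sketch(sig, max_perc=0.2, min_perc=0.2):
    """Apply avg_extremes to the positive and negative samples separately."""
    positive = [s for s in sig if s > 0]
    negative = [s for s in sig if s < 0]
    max_p, min_p = avg_extremes(positive, max_perc, min_perc)
    max_n, min_n = avg_extremes(negative, max_perc, min_perc)
    return max_p, min_p, max_n, min_n


sig = list(range(1, 11)) + [-x for x in range(1, 11)]
print(max_min_sketch(sig))  # (9.5, 1.5, -1.5, -9.5)
```

Averaging several of the largest samples makes the estimate less sensitive to a single spike than taking max(sig) directly.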
audioperm.utils.noise_boundaries(sig, max_perc=0.2, min_perc=0.2)
Calculates maximum noise boundaries for a signal.
- Parameters
sig (ndarray) – A numpy array.
max_perc (float) – Population percentage for taking the max.
min_perc (float) – Population percentage for taking the min.
- Returns
tuple containing:
max_n (float) – maximum boundary for noise
min_n (float) – minimum boundary for noise
- Return type
tuple
audioperm.utils.save_audio(sig, filename, sr=22050)
Takes a PCM 16 or float32 signal and saves the audio in PCM 16 format.
- Parameters
sig (ndarray) – A numpy array.
filename (str) – Filepath and filename.
sr (int) – Sampling rate.
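To make the PCM 16 conversion concrete, here is a minimal sketch that writes a mono signal with Python's standard wave module. It is not the library's implementation: `save_pcm16` is a hypothetical name, the signal is a plain list rather than an ndarray, and float samples are assumed to lie in [-1.0, 1.0] and are scaled by 32767.

```python
import struct
import wave


def save_pcm16(sig, filename, sr=22050):
    """Write a mono signal to a 16-bit PCM WAV file (hypothetical helper)."""
    if any(isinstance(s, float) for s in sig):
        # Assumption: floats are in [-1.0, 1.0]; clip, then scale to int16.
        ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in sig]
    else:
        # Integer input is taken to be int16 already.
        ints = [int(s) for s in sig]
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16 bit
        wf.setframerate(sr)
        # "<h" is a little-endian signed 16-bit integer.
        wf.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

A call such as `save_pcm16([0.0, 0.5, -1.0], "out.wav", sr=8000)` would produce a three-frame file; the filename here is only an example.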
audioperm.utils.type_nested(iterable, tp)
Checks whether all elements of iterable are of type tp (homogeneous).
- Parameters
iterable (list) – A list.
tp (type) – Expected type of the elements.
- Returns
True if all elements are of the same type.
- Return type
bool
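A homogeneity check of this kind might look like the sketch below. It is an assumption about the behaviour, not the library's code: `all_of_type` is a hypothetical name, and descending into nested lists is inferred from the function name only.

```python
def all_of_type(iterable, tp):
    """Return True if every leaf in a (possibly nested) list is an instance of tp."""
    for item in iterable:
        if isinstance(item, list):
            # Assumption: nested lists are checked recursively.
            if not all_of_type(item, tp):
                return False
        elif not isinstance(item, tp):
            return False
    return True


print(all_of_type([[1, 2], [3]], int))  # True
print(all_of_type([1, "a"], int))       # False
```

Note that an empty list counts as homogeneous in this sketch.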
Module contents
A Python library for generating different permutations of audible segments from audio files.