Let’s say you have an audio recording of a participant reading passages, talking about their day, or otherwise speaking without another person involved. The workflow looks like this:

[Image: single_audio.png — workflow for processing a single-speaker audio recording]

The main tasks for processing and analyzing this audio file are:

  1. Vocal acoustics
  2. Transcription + speech characteristics

<aside> ⚠️

OpenWillis currently supports .wav and .mp3 files for audio processing. If your files are in another format, such as .m4a, follow this link for a brief tutorial on how to include file conversion in your code.

</aside>
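If you would rather handle the conversion directly in your own script, one common approach is the pydub library (which relies on ffmpeg being installed). The sketch below is a minimal example of converting .m4a files to .wav, and the folder path and file naming are placeholders to adapt to your project:

from pydub import AudioSegment
import os

m4a_folder = '/Users/researcher/project/data/audio_files/m4a_files' # hypothetical folder containing .m4a files

for filename in os.listdir(m4a_folder):
  if filename.endswith('.m4a'):
    m4a_path = os.path.join(m4a_folder, filename)
    wav_path = os.path.splitext(m4a_path)[0] + '.wav'

    # Load the .m4a file and write it back out as a .wav file
    AudioSegment.from_file(m4a_path, format = 'm4a').export(wav_path, format = 'wav')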

1 - Vocal acoustics

The basic vocal_acoustics function processes a single file at a time and can be run with a single line of code (the imports below are also used in the later examples):

import os
import pandas as pd
import openwillis as ow

framewise, summary = ow.vocal_acoustics(audio_path = 'audio.wav', voiced_segments = False, option = 'simple')

summary # to examine the output

To save this output into a .csv file to analyze later:

output_dir = '/Users/researcher/project/output/' # change to your output path
output_filename = 'summary.csv'
output_csv_path = os.path.join(output_dir, output_filename)

summary.to_csv(output_csv_path, index = False)

To process multiple audio files in a single batch, you can use a for loop with the same function:

folder_path = '/Users/researcher/project/data/audio_files/wav_files' # make sure to change to the path that contains your data

framewise_data = pd.DataFrame() # initialize empty dataframes for storing results
summary_data = pd.DataFrame()

for filename in os.listdir(folder_path):
  if filename.endswith('.wav'):
    audio_path = os.path.join(folder_path, filename)

    # Run vocal acoustics function
    framewise, summary = ow.vocal_acoustics(audio_path = audio_path, voiced_segments = False, option = 'simple')

    # Identify each file by adding its name (without the '.wav' extension) as the first column of each dataframe
    filename_no_ext = os.path.splitext(filename)[0]

    # Add filename column as the first column using insert()
    framewise.insert(0, 'filename', filename_no_ext)
    summary.insert(0, 'filename', filename_no_ext)

    # Store results for each file in each dataframe
    framewise_data = pd.concat([framewise_data, framewise], ignore_index=True)
    summary_data = pd.concat([summary_data, summary], ignore_index=True)

summary_data.head() # Examine the first few rows of data to make sure it worked

And you can save this output using the following code:

output_dir = '/Users/researcher/project/output/' # change to your output path
output_filename = 'summary_data.csv'
output_csv_path = os.path.join(output_dir, output_filename)

summary_data.to_csv(output_csv_path, index = False)

Variations of the vocal_acoustics function

In the code above, the option parameter was set to ‘simple’. This outputs a set of basic measures, including fundamental frequency (F0) mean and variability, formant frequencies, pause characteristics, and cepstral measures. The majority of speech tasks will use this option.

Users also have the option to extract measures specifically related to assessing voice tremor or more advanced features including glottal measures. More details on which features are included in each of these options can be found here.
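To illustrate, the call for these variants is identical apart from the value passed to option. Note that the exact option strings below ('tremor', 'advanced') are assumptions based on the descriptions above, so confirm them against the vocal_acoustics documentation for your version of OpenWillis:

# option values below are assumed; check the vocal_acoustics documentation for the supported strings
framewise_tremor, summary_tremor = ow.vocal_acoustics(audio_path = 'audio.wav', voiced_segments = False, option = 'tremor')
framewise_adv, summary_adv = ow.vocal_acoustics(audio_path = 'audio.wav', voiced_segments = False, option = 'advanced')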

2 - Transcription and speech characteristics

Before running the speech_characteristics function, you first need to transcribe the audio file into a JSON file containing the transcript and the timing of each word.