Audio File Processing: ECG Audio Using Python

In this post, we look at an application of audio file processing for a good cause, the analysis of ECG heartbeat audio, and write the code in Python.

By Taposh Dutta Roy, Kaiser Permanente

In my last post, “Basics of Audio File Processing in R”, we covered the fundamentals of audio processing and worked through some examples in R. In this post, we will look into an application of audio file processing for a good cause, the analysis of ECG heartbeat audio, and write the code in Python.

To understand this better, we will look into the basic anatomy of the heart, measurements, the origin and characteristics of heart sounds, techniques for heart sound analysis, and Python code for analyzing the sound. The Python code in this article shows how to read and process the data and develop a very simple model. In the next article, we will do more processing of the data and develop a better model.

Further, note that I am not a clinician; all my knowledge comes from reading papers, blogs and articles. I have listed all my sources and references. I will start with the location of the heart in the thoracic cavity, as shown below.

Source: Google



Basic Anatomy of Mammalian Heart



Source: medmov


There are 5 basic anatomical areas of a mammalian heart:

  1. The four cardiac chambers
  2. The four cardiac valves
  3. The four layers of cardiac tissue
  4. The great heart vessels
  5. The natural cardiac pacemakers and the conduction system of the heart


Human Heart

The human heart is a four-chambered pump with two atria for collection of blood from the veins and two ventricles for pumping out the blood to the arteries.


Source: Research work of Dr. Guy Amit


The right side of the heart pumps blood to the pulmonary circulation (lungs), and the left side pumps blood to the systemic circulation (the rest of the body). The blood from the pulmonary circulation returns to the left atrium (through the pulmonary veins), and the blood from the systemic circulation returns to the right atrium (through the superior/inferior vena cava).

Two sets of valves control the flow of blood: the AV-valves (mitral and tricuspid) between the atria and the ventricles, and the semilunar valves (aortic and pulmonary) between the ventricles and the arteries.


Electrical Control


The periodic activity of the heart is controlled by an electrical conducting system. The electrical signal originates in specialized pacemaker cells in the right atrium (the sinoatrial node), and is propagated through the atria to the AV node (a delay junction) and on to the ventricles. The electrical action potential excites the muscle cells and causes the mechanical contraction of the heart chambers.


Mechanical System : Systole & Diastole


The contraction phase of the ventricles is called systole. The ventricular systole is followed by a resting or filling phase that is called diastole. The mechanical activity of the heart includes blood flow, vibrations of the chamber walls and opening and closing of the valves.

The systole is subdivided into isovolumetric contraction and ejection.

The diastole is subdivided into isovolumetric relaxation and ventricular filling.


Right side: vertical section of the cardiac muscle shows the internal structure of the heart. Left side: schematic representation of a reciprocating type pump having a pumping chamber and input output ports with oppositely oriented valves. (Source :



Modulating System

The human body is a complex system with a variety of sub-systems that work in tandem. Some of the systems that help modulate the heart are the autonomic nervous system, the hormonal system and the respiratory system. These, along with electrical and mechanical factors, make our heart work.

The autonomic nervous system regulates the heart rate: the sympathetic system enhances automaticity, while the parasympathetic system (vagus nerve) inhibits it. The nervous system also modulates the mechanical contractility of the heart chambers.

The hormonal system secretes hormones such as insulin and epinephrine, which affect the contractility of the heart muscle.

The respiratory system causes periodic changes in the thoracic pressure, and thus affects the blood flow, venous pressure and venous return, triggering reflex responses (baroreceptor reflex, Bainbridge reflex) that modulate the heart rate. The heart rate increases during inspiration and decreases during expiration.

Other mechanical factors are the peripheral resistance of the blood vessels, which can change due to internal or external factors (stenosis), the resulting venous return, and the state of the valves (torn, calcified).

Other electrical factors are ectopic pacemaker cells, conduction problems and reentry circuits.

The complexity and interaction of these systems, as depicted by Dr. Guy Amit, are shown below.


Source: Research by Dr. Guy Amit



Heart Sounds



Four locations are most often used to listen to the heart sounds. They are named according to the positions where the valves can be best heard:

  • Aortic area — centered at the second right intercostal space.
  • Pulmonic area — in the second intercostal space along the left sternal border.
  • Tricuspid area — in the fourth intercostal space along the left sternal edge.
  • Mitral area — at the cardiac apex, in the fifth intercostal space on the midclavicular line.

The different types of heart sounds are as follows:

  1. S1 — onset of the ventricular contraction
  2. S2 — closure of the semilunar valves
  3. S3 — ventricular gallop
  4. S4 — atrial gallop
  5. EC — Systolic ejection click
  6. MC — Mid-systolic click
  7. OS — Diastolic sound or opening snap
  8. Murmurs

Fundamental heart sounds (FHSs) usually include the first (S1) and second (S2) heart sounds.

S1 occurs at the beginning of isovolumetric ventricular contraction, when the mitral and tricuspid valves close due to the rapid increase in pressure within the ventricles.

S2 occurs at the beginning of diastole with the closure of the aortic and pulmonic valves.

While the FHSs are the most recognizable sounds of the heart cycle, the mechanical activity of the heart may also cause other audible sounds, such as the third heart sound (S3), the fourth heart sound (S4), systolic ejection click (EC), mid-systolic click (MC), diastolic sound or opening snap (OS), as well as heart murmurs caused by the turbulent, high-velocity flow of blood.



We obtain the heart sound recordings from the PhysioNet challenge site's 2016 challenge, Classification of Heart Sound Recordings. The goal of this challenge is to classify heart sounds as normal, abnormal or unclear. For our purposes we will classify into 2 categories, normal and abnormal, to keep the demonstration simple.


Python Code

Similar to R, there are several libraries for processing audio data in Python, including the ones listed below. In this code we will use one of them: librosa.

  • librosa
  • scipy
  • wav

We will use librosa since we can use it for audio feature extraction as well.


Installing Libraries


librosa returns the audio data and the sampling rate, which defaults to 22050 Hz; you can change this or keep the file's native sampling rate. Let's load a single audio file and look at the signal.
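A minimal sketch of this loading step, assuming librosa is installed. The function names load_heart_sound and duration_seconds and the file name a0001.wav are hypothetical, introduced only for illustration; librosa.load itself resamples to 22050 Hz by default and keeps the native rate when sr=None is passed.

```python
import numpy as np


def load_heart_sound(path, sr=22050):
    """Load an audio file as a mono float array plus its sampling rate.

    librosa is imported lazily so that the helper below works without it.
    Pass sr=None to keep the file's native sampling rate.
    """
    import librosa  # assumption: installed, e.g. with `pip install librosa`

    y, sr = librosa.load(path, sr=sr)
    return y, sr


def duration_seconds(y, sr):
    """Duration of a signal in seconds: number of samples / sampling rate."""
    return len(y) / float(sr)


# Usage (hypothetical file name, replace with a real path from the dataset):
# y, sr = load_heart_sound("a0001.wav")
# print(y.shape, sr, duration_seconds(y, sr))
```

The duration helper is what lets us compare recordings of different lengths later on.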

Next, load all the training datasets and combine them into one complete training set. We will also label each recording as normal or abnormal.
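One way to collect the labels could look like the sketch below. It assumes each training folder contains a headerless REFERENCE.csv whose rows are record_name,label, with -1 for normal and 1 for abnormal; check this against your copy of the data. The function names are hypothetical.

```python
import csv
import os


def read_labels(folder, reference_name="REFERENCE.csv"):
    """Return a list of (wav_path, label) pairs for one training folder.

    Assumes each CSV row is `record_name,label` with -1 = normal and
    1 = abnormal; labels are remapped to 0 (normal) and 1 (abnormal).
    """
    pairs = []
    with open(os.path.join(folder, reference_name), newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            name, raw = row[0].strip(), int(row[1])
            label = 1 if raw == 1 else 0
            pairs.append((os.path.join(folder, name + ".wav"), label))
    return pairs


def read_all_labels(folders):
    """Concatenate the (path, label) pairs of several training folders."""
    all_pairs = []
    for folder in folders:
        all_pairs.extend(read_labels(folder))
    return all_pairs
```

Each path in the result can then be loaded with librosa, and the labels become the target vector for the model.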

Observe: the lengths of the files are different.

One other thing to note here is the audio file duration. The first file was 20 seconds long while the second was 35 seconds. There are two approaches to addressing this:

1. Pad the audio with zeros to given length
2. Repeat audio to given length, for example max length of all audio samples

Also, note that librosa's default sampling rate is 22050 Hz (you can change this or use the native sampling rate). Please check the definition of sampling rate and other details in the prior post, “Basics of Audio File Processing in R”.


Sampling & Setting all files to be of same length


Helper functions for Zero Padding and Repeating Audio
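A sketch of what such helpers might look like, using only NumPy. The names pad_with_zeros and repeat_to_length are assumptions made for illustration, not necessarily those used in the original code.

```python
import numpy as np


def pad_with_zeros(y, length):
    """Truncate y to `length` samples, or append zeros until it has `length`."""
    y = np.asarray(y)[:length]
    return np.pad(y, (0, length - len(y)), mode="constant")


def repeat_to_length(y, length):
    """Tile y end to end until it covers `length` samples, then truncate."""
    y = np.asarray(y)
    if len(y) == 0:
        raise ValueError("cannot repeat an empty signal")
    reps = -(-length // len(y))  # ceiling division
    return np.tile(y, reps)[:length]
```

Zero padding keeps the original signal untouched but adds silence; repeating avoids silence but introduces an artificial seam where the copies meet.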


Get data for processing by a model (note: this process takes time)
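As a rough sketch, the loaded signals can be stacked into one matrix by preallocating zeros and copying each signal in. The function name stack_signals is hypothetical; the length 110250 is the one used in this article, which corresponds to 5 seconds at 22050 Hz.

```python
import numpy as np

SAMPLE_RATE = 22050
N_SAMPLES = 110250  # 5 seconds at 22050 Hz


def stack_signals(signals, n_samples=N_SAMPLES):
    """Stack variable-length 1-D signals into an (n_files, n_samples) matrix.

    Longer signals are truncated; shorter ones stay zero padded because
    the matrix is preallocated with zeros.
    """
    X = np.zeros((len(signals), n_samples), dtype=np.float32)
    for i, y in enumerate(signals):
        y = np.asarray(y, dtype=np.float32)[:n_samples]
        X[i, : len(y)] = y
    return X
```

Loading every file is the slow part; the stacking itself is cheap once the signals are in memory.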



All files of same length and zero padded


Let's review wave_files: we have one row for each file and one value for each of the 110250 columns. (Remember that our audio length is 110250 samples, which is 5 seconds at 22050 Hz.)

We are almost ready to pass this data to an algorithm. We need to create our training and test datasets. This means training the model on raw data, with no feature engineering of any sort. Let's see what we get.
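A minimal sketch of such a split using only NumPy. The function name train_test_split_simple, the 20% test fraction and the fixed seed are assumptions for illustration; scikit-learn's train_test_split is a common alternative.

```python
import numpy as np


def train_test_split_simple(X, y, test_fraction=0.2, seed=0):
    """Shuffle the rows once, then hold out `test_fraction` of them for testing."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```

Shuffling before splitting matters here because the files are read folder by folder, so neighbouring rows tend to come from the same source.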

We will use the “adam” optimizer and the binary_crossentropy loss; for details on these, please check the paper.
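To make the loss concrete, here is a small NumPy sketch of binary cross-entropy, the mean of -[y log(p) + (1 - y) log(1 - p)] over the samples. In Keras, these two choices are typically passed as the strings "adam" and "binary_crossentropy" to model.compile; that the original model was built with Keras is an assumption, and the function below is only an illustration of the formula.

```python
import numpy as np


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    p = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    losses = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(losses))
```

Confident correct predictions give a loss near zero, while confident wrong predictions are penalised heavily, which is why this loss suits a normal vs abnormal classifier.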


In the next article, we will use some of the frequency-based strategies discussed in the initial article, this time in Python, to improve the score.
Source code:


Bio: Taposh Dutta Roy leads the Innovation Team of KPInsight at Kaiser Permanente. These are his thoughts based on his personal research. These thoughts and recommendations are not those of Kaiser Permanente, and Kaiser Permanente is not responsible for the content. If you have questions, Mr. Dutta Roy can be reached via LinkedIn.



Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals (2000). Circulation. 101(23):e215-e220.

Basics of Audio File Processing in R
In today’s day and age, digital audio has been part and parcel of our life. One can talk to Siri or Alexa or “Ok…

Basic anatomy of the human heart - The Cardio Research Web Project
The heart has evolved in mammals to deliver its unique and crucial function of ejecting and collecting blood to and…

McMaster University Department of Medicine >> Cardiology
The Division of Cardiology participates actively in undergraduate MD teaching and the training of Residents in…

LibROSA - librosa 0.7.2 documentation
LibROSA is a python package for music and audio analysis. It provides the building blocks necessary to create music…

Thoracic cavity | anatomy
Thoracic cavity, the second largest hollow space of the body. It is enclosed by the ribs, the vertebral column, and the…

Original. Reposted with permission.