[Interactive figure: UMAP projection plane. Each point corresponds to the audio of one sample.]

Chrome or Firefox is recommended for viewing this page.

Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection

Jianwei Zhang, Julie Liss, Suren Jayasuriya, and Visar Berisha

Supplemental - Interactive UMAP Projection of SVD In-corpus Validation Dataset


Introduction
This website provides an interactive version of the UMAP projection of the SVD in-corpus validation dataset. It includes the raw audio of some subjects, which can be played by clicking the points on the figure. In general, voices located in the top-right corner present with marked dysphonia, while voice samples located in the top-left and bottom-right present with healthy voice quality or mild dysphonia. We indicate two axes on the embedding distribution: (1) from the bottom-left to the top-right and (2) from the bottom-left to the top-left, which are related to vocal quality and voice pitch, respectively. These two axes are also consistent with our training design: (1) a contrastive loss is used to separate dysphonic and healthy voices, and (2) the samples selected for one training batch come from either male or female subjects. This result further illustrates that the proposed acoustic feature embeddings are sensitive to vocal quality and voice characteristics.
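The two training design choices mentioned above can be illustrated with a small sketch. It is not the implementation used in the paper: the loss shown is the classic margin-based pairwise contrastive loss, which is assumed here only for illustration, and the names `contrastive_loss`, `sample_same_sex_batch`, `batch_loss`, the `margin` parameter, and the dictionary keys `sex`, `label`, and `emb` are all hypothetical.

```python
import math
import random


def contrastive_loss(z1, z2, same_class, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form, not the paper's).

    Pairs from the same class are pulled together; pairs from different
    classes are pushed apart until they are at least `margin` away.
    """
    d = math.dist(z1, z2)  # Euclidean distance between the two embeddings
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def sample_same_sex_batch(samples, batch_size, rng):
    """Pick one sex at random, then draw the whole batch from that sex."""
    sex = rng.choice(sorted({s["sex"] for s in samples}))
    pool = [s for s in samples if s["sex"] == sex]
    return rng.sample(pool, min(batch_size, len(pool)))


def batch_loss(batch, margin=1.0):
    """Average the contrastive loss over all pairs in a batch."""
    total, count = 0.0, 0
    for i in range(len(batch)):
        for j in range(i + 1, len(batch)):
            a, b = batch[i], batch[j]
            same = a["label"] == b["label"]
            total += contrastive_loss(a["emb"], b["emb"], same, margin)
            count += 1
    return total / count if count else 0.0


# Toy data (made up): "p" = dysphonic, "n" = healthy; "m"/"w" = sex.
toy = [
    {"sex": "m", "label": "p", "emb": (0.9, 0.8)},
    {"sex": "m", "label": "n", "emb": (0.1, 0.2)},
    {"sex": "m", "label": "p", "emb": (0.8, 0.9)},
    {"sex": "w", "label": "n", "emb": (0.2, 0.7)},
    {"sex": "w", "label": "p", "emb": (0.9, 0.9)},
    {"sex": "w", "label": "n", "emb": (0.1, 0.8)},
]

rng = random.Random(0)
batch = sample_same_sex_batch(toy, batch_size=3, rng=rng)
print(batch_loss(batch))
```

Because every pair in a batch shares the same sex under this scheme, the dysphonic versus healthy contrast within a batch cannot be explained by sex-related differences such as pitch, which is one plausible reading of why vocal quality and pitch appear as separate directions in the projection.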



Usage
When the mouse pointer hovers over a target point, the icon of that point is enlarged so that it is easy to recognize. Click the left mouse button to hear the /a/ phonation audio corresponding to that point.