Exocytosis is a complex process involving the regulated release of neurotransmitters from presynaptic neurons, and precise control of this process is crucial for neurotransmission. Synapsin and the SNARE (Soluble NSF Attachment Protein Receptor) complex are proteins that play significant roles in regulating exocytosis. Studies have demonstrated that synapsin modulates vesicle release by controlling the movement of vesicles to the active zone, where the SNARE complex facilitates vesicle fusion with the presynaptic membrane. Despite synapsin being the most abundant protein in neurons and both proteins interacting with synaptic vesicles, the role of synapsin in modulating SNARE dynamics remains unclear. In this investigation, we employed magnetic tweezers to probe the interaction between synapsin and the SNARE complex. By exerting controlled forces on individual SNARE complexes in the presence of synapsin, we observed that synapsin can impact the mechanical properties of SNARE, implying a potential role for synapsin in regulating neurotransmitter release through its effects on SNARE dynamics. These findings emphasize the importance of exploring synapsin-SNARE interactions in the nervous system and offer fresh insights into the role of synapsin in neuronal function.
The trans-activating CRISPR RNA (tracrRNA) is fundamental to the CRISPR/Cas9 system, forming guide RNA with crRNA. Despite its known importance in crRNA maturation and Cas9 RNP-mediated DNA cleavage, the exact function of tracrRNA scaffolds remains unclear. In this investigation, we generated five tracrRNA variants by removing specific scaffolds, including Stem loops 1, 2, and 3, and the Linker. Using a new single-molecule assay, we directly observed target binding and cleavage processes guided by Cas9 RNP. Our findings underscore the vital role of the Linker in initiating R-loops and highlight the significance of Stem loop 2 in identifying PAM-distal mismatches within target DNA. Furthermore, we explored cleavage efficiency by adding tracrRNA segments, indicating that maintaining the integrity of Stem loops 2 and 3 is crucial for potent Cas9 activity. We believe that these results deepen our understanding of Cas9 functionality and offer insights into its detailed mechanism from target binding to cleavage.
Audio-to-talking face generation stands at the forefront of advancements in generative AI. It bridges the gap between audio and visual representations by generating synchronized and realistic talking faces. This significantly improves human-computer interaction and content accessibility for diverse audiences. Despite substantial research in this area, critical challenges such as the lack of realistic facial animations, inaccurate audio-lip synchronization, and intensive computational demands continue to restrict the practical application of talking face generation methods. To address these issues, we introduce a novel approach leveraging the emerging capabilities of Stable diffusion models and vision Transformers for Talking face generation (StableTalk). By incorporating the Re-attention mechanism and adversarial loss into StableTalk, we have markedly enhanced the audio-lip alignment and the consistency of facial animations across frames. More importantly, we have optimized computational efficiency by refining operations within the latent space and dynamically adjusting the visual focus based on the given conditions. Our experimental results demonstrate that StableTalk surpasses existing methods in terms of image quality, audio-lip synchronization, and computational efficiency.
The eukaryotic cell cycle, a pivotal biological process, has been extensively studied and
mathematically modelled in recent decades. Despite concerted efforts, identifying the minimal gene set essential for orderly cell cycle progression remains elusive. Synthetic biology, renowned for genetic engineering applications, also provides a pathway for addressing fundamental biological queries through “learning from building.” The Synthetic Yeast Genome (Sc2.0) project exemplifies this by synthesising Saccharomyces cerevisiae’s genome with changes that advance our understanding of eukaryotic genomes.
Expanding from Sc2.0’s groundwork, we aim to pioneer synthetic yeast genomes that are
minimal, modular, and reprogrammable. As a proof-of-concept, we constructed a synthetic
genome module housing nine of the key cell cycle genes. Employing CRISPR, we
systematically deleted these genes from their native loci and reinserted them together as a
synthetic gene cluster. While the genes are individually non-essential, the combined absence of all nine renders this synthetic module indispensable.
Through Cre/loxP-mediated recombination, we investigated the gene combinations necessary for yeast cell cycle progression. Cre recombinase facilitated targeted gene deletions between intergenic loxP sites within the module and rapidly generated diverse strains with combinatorial cluster deletion profiles, covering all potential combinations. Using flow cytometry sorting, we developed a way to isolate hundreds of viable deletion combinations, and we developed the Pool of Long Amplified Reads (POLAR) sequencing technique to enable the analysis of gene deletion frequency and gene content combinations for hundreds of strains with different cell cycle modules. These experimental findings were compared to computational models of the cell cycle, bringing us closer to understanding the minimal gene content for this function.
Upon pioneering this work, we now envisage a future where genome designers can predict
gene sets necessary for specialised tasks and can then synthetically arrange these genes on chromosomes and design intergenic regions to regulate their gene expression appropriately.
International commerce is a sphere in which well-built customs rules are crucial. Nevertheless, because of illegal acts and fraudulent undertakings, there is an urgent need for safety and economic soundness in customs controls. India’s customs service and related organizations employ artificial intelligence-based technologies that aid in combating illegal trade globally. This paper examines how AI can be used to identify people who misuse technology for illicit imports or exports. These evaluations also demonstrate how border control has become more dependent on AI, identify major concerns, and predict future trends. AI may provide an opportunity to strengthen border security as well as expedite legitimate business relations.
People quickly recognise human actions carried out in everyday activities. There is evidence that Minimal Recognisable Configurations (MIRCs) contain a combination of spatial and temporal visual features critical for reliable recognition. For complex activities, observers may provide different descriptions that vary in their semantic similarity (e.g., washing dishes vs cleaning dishes), potentially complicating the investigation of MIRCs in action recognition. Therefore, we measured the semantic consistency for 128 short videos of complex actions from the Epic-Kitchens-100 dataset (Damen et al., 2022), selected based on poor classification performance by our state-of-the-art computer vision network MOFO (Ahmadian et al., 2023). In an online experiment, participants viewed each video and identified the performed action by typing a description using 2-3 words (capturing action and object). Each video was classified by at least 30 participants (N=76 total). Semantic consistency of the responses was determined using a custom pipeline involving the sentence-BERT language model, which generated embedding vectors representing semantic properties of the responses. We then used adjusted pair-wise cosine similarities between response vectors to compute a ground truth description for each video, i.e., the response with the greatest semantic neighbourhood density (e.g., pouring oil, closing shelf). The greater the semantic neighbourhood density was for a ground truth candidate, the more semantically consistent were the responses for the associated video. We uncovered 87 videos whose semantic consistency confirmed their reliable recognisability, i.e. where the cosine similarity between the ground truth candidate and at least 70% of responses was above a similarity threshold of 0.65.
We will use a subsample of these videos to investigate the role of MIRCs in human action recognition, e.g., gradually degrading the spatial and temporal information in videos and measuring the impact on action recognition. The derived semantic space and MIRCs will be used to revise MOFO into a more biologically consistent and better performing model.
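The neighbourhood-density analysis described above can be sketched as follows. This is an illustrative reconstruction, not the authors' exact pipeline: it assumes the sentence-BERT embedding vectors have already been computed, the function name is hypothetical, and plain cosine similarity stands in for the "adjusted" similarities used in the study.

```python
import numpy as np

def ground_truth_and_consistency(embeddings, sim_threshold=0.65, agreement=0.70):
    """Pick the response with the densest semantic neighbourhood and
    check whether enough of the other responses agree with it.

    embeddings : (n_responses, dim) array of sentence-embedding vectors
    (e.g. from sentence-BERT), treated here as given.
    """
    # Normalise rows so dot products equal cosine similarities.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X.T                              # pairwise cosine similarities
    np.fill_diagonal(sims, 0.0)                 # ignore self-similarity
    density = sims.sum(axis=1) / (len(X) - 1)   # mean similarity to all others
    gt = int(np.argmax(density))                # ground-truth candidate
    # Fraction of the remaining responses close enough to the candidate.
    others = np.delete(sims[gt], gt)
    frac_agree = float((others >= sim_threshold).mean())
    return gt, frac_agree, frac_agree >= agreement
```

With four near-identical response embeddings and one outlier, the candidate is one of the cluster members and the video counts as reliably recognisable only if at least 70% of the other responses exceed the 0.65 threshold.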
Electrochemical potentials are essential for cellular life. For instance, cells generate and harness electrochemical gradients to drive a myriad of fundamental processes, from nutrient uptake and ATP synthesis to neuronal transduction. To generate and maintain these gradients, all cellular membranes carefully regulate ionic fluxes using a broad array of transport proteins. For this reason, it is extremely difficult to untangle specific ion transport pathways and link them to membrane potential variations in live cell studies. Conversely, synthetic membrane models, such as black lipid membranes and liposomes, are free of the structural complexity of cells and thus make it possible to isolate particular ion transport mechanisms and study them under tightly controlled conditions. Still, there is a lack of quantitative methods for correlating ionic fluxes to electrochemical gradient buildup in membrane models. Consequently, the use of these models as a tool for unravelling the coupling between ion transport and electrochemical gradients is limited. We developed a fluorescence-based approach for resolving the dynamic variation of membrane potential in response to ionic flux across giant unilamellar vesicles (GUVs). To gain maximal control over the size and membrane composition of these micron-sized liposomes, we developed an integrated microfluidic platform that is capable of high-throughput production and purification of monodisperse GUVs. By combining our microfluidic platform with quantitative fluorescence analysis, we determined the permeation rate of two biologically important electrolytes – protons (H+) and potassium ions (K+) – and were able to correlate their flux with electrochemical gradient accumulation across the lipid bilayer of single GUVs. Applying similar analysis principles, we also determined the permeation rate of K+ across two archetypal ion channels, gramicidin A and outer membrane porin F (OmpF).
We then showed that the translocation rate of H+ across gramicidin A is four orders of magnitude higher than that of K+, in contrast to OmpF, for which similar transport rates were measured for both ions.
This research represents a groundbreaking approach in plant phenotyping by harnessing 3D point clouds generated from video data. Focusing on the comprehensive characterization of plant traits, this method enhances the precision and depth of phenotypic analysis, crucial for advancements in genetics, breeding, and agricultural practices.
Advanced Video Data Capture and Processing for Detailed Segmentation
High-Fidelity Video Acquisition: Capturing detailed video footage of plants under varying environmental conditions forms the foundation of this method. The use of high-resolution cameras allows for capturing minute details crucial for accurate part segmentation.
Rigorous Preprocessing for Optimal Data Quality: Following capture, the video data undergoes meticulous preprocessing. Stabilization, noise filtering, and color correction are performed to ensure that the subsequent segmentation algorithms can accurately identify different parts of the plant.
Segmentation and 3D Point Cloud Generation: The application of state-of-the-art image processing algorithms segments the plant parts within each video frame. Subsequently, photogrammetry and depth estimation techniques create detailed 3D point clouds, effectively capturing the geometry of individual plant components.
Part Segmentation and Trait Measurement for Enhanced Phenotyping
Precise Plant Part Segmentation: This methodology enables the accurate segmentation of individual plant parts, such as leaves, stems, and flowers, within the 3D space. This precise segmentation is crucial for assessing complex plant traits and understanding plant structure in its entirety.
Comprehensive Trait Measurement: The 3D point clouds facilitate comprehensive measurements of plant traits. This includes quantifying leaf area, stem thickness, flower size, and even more subtle features like leaf venation patterns, providing a multi-dimensional view of plant phenotypic traits.
Temporal Tracking for Dynamic Trait Analysis: An integral advantage of using video data is the ability to track and measure these traits over time. This dynamic analysis allows for monitoring growth patterns, developmental changes, and responses to environmental stimuli in a way that static images cannot achieve.
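As a concrete illustration of the kind of trait measurement the point clouds enable, the sketch below estimates the area of a single segmented leaf from its 3D points. This is a generic approach, not necessarily the one used in this work: it assumes the leaf is near-planar, projects its points onto a best-fit plane found via SVD (PCA), and takes the area of the 2D convex hull.

```python
import numpy as np
from scipy.spatial import ConvexHull

def leaf_area(points):
    """Approximate the surface area of a (near-planar) leaf from its
    segmented 3D points: project onto the best-fit plane found by PCA,
    then take the area of the 2D convex hull.

    points : (n, 3) array of xyz coordinates for one segmented leaf.
    """
    centred = points - points.mean(axis=0)
    # Principal axes of the point cloud; the two leading axes span the
    # best-fit plane, the third is the plane normal.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    plane_coords = centred @ vt[:2].T     # project to 2D plane coordinates
    hull = ConvexHull(plane_coords)
    return hull.volume                    # in 2D, .volume is the hull area
```

For lobed or serrated leaves an alpha shape or surface mesh would be more faithful than a convex hull; the hull serves here as the simplest area proxy.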
Conclusion: A Breakthrough in Plant Phenotyping and Agricultural Research
This research significantly enhances the capability for detailed plant part segmentation and trait measurement, setting a new standard in plant phenotyping. The level of detail and accuracy afforded by this method offers invaluable insights for agricultural technology, plant genetics, and breeding programs. It represents a critical step forward in our ability to understand and optimize plant characteristics, with far-reaching implications for food production and ecological sustainability.
Our rich, embodied visual experiences of the world involve integrating information from multiple sensory modalities – yet how the brain brings together multiple sensory reference frames to generate such experiences remains unclear. Recently, it has been demonstrated that BOLD fluctuations throughout the brain can be explained as a function of the activation pattern on the primary visual cortex (V1) topographic map. This class of ‘connective field’ models allows us to project V1’s map of visual space into the rest of the brain and discover previously unknown visual organization. Here, we extend this powerful principle to incorporate both visual and somatosensory topographies by explaining BOLD responses during naturalistic movie-watching as a function of two spatial patterns (connective fields) on the surfaces of V1 and S1. We show that responses in the higher levels of the visual hierarchy are characterized by multimodal topographic connectivity: these responses can be explained as a function of spatially specific activation patterns on both the retinotopic and somatosensory homunculus topographies, indicating that somatosensory cortex participates in naturalistic vision. These novel multimodal tuning profiles are in line with known visual category selectivity, for example for faces and manipulable objects. Our findings demonstrate a scale and granularity of multisensory tuning far more extensive than previously assumed. When inspecting topographic tuning in S1, we find that a full band of extrastriate visual cortex, from retrosplenial cortex laterally to the fusiform gyrus, is tiled with somatosensory homunculi. These results demonstrate the intimate integration of information about visual coordinates and body parts in the brain that likely supports visually guided movements and our rich, embodied experience of the world.
Finally, we present initial data from a new, densely sampled 7T fMRI movie-watching dataset optimised to shed light on the brain basis of human action understanding.
We do not notice everything in front of us, due to our limited attention capacity. What we attend to forms our conscious experience and is what we retain over time. Creative content creators must therefore strive to direct the viewer’s attention in different media, from cinema to computer games. To do this they have developed various techniques that either directly use centrally presented cues, such as arrows or instructions, to move attention, or rely on image features, so-called “bottom-up” cues, which involve manipulating the salience of parts of an image. On a conventional screen, shifting attention usually involves moving our central vision around the display; directing attention becomes more challenging in virtual environments, where users are free to explore by moving in any direction. This can be seen in first-person, screen-based computer video games. Such an experience allows the user to choose how they sample their environment, yet the designer of the environment often wishes the user to interact with and view certain parts of the scene. In this study we test a subtle manipulation of visual attention through varying depth of field. Varying depth of field is a cinematic technique that can be implemented in virtual worlds and involves keeping parts of the scene in focus whilst blurring other parts. We used eye tracking to investigate this technique in a 3D game environment rendered on a monitor screen. Participants navigated through the environment using keyboard keys; in the first part they explored freely, and in the second part they were instructed to find a target object. We manipulated whether the frames were rendered fully in focus (termed a deep depth of field) or whether a shallow depth of field was applied (where the outer edges of the scene appear blurred). We measured where on the screen participants looked. We divided the screen into 3×3 equal-sized regions and calculated the proportion of the time participants spent looking in the central square.
On average across all trials, participants spent 67% of their fixation time on the central area of the screen, suggesting that they preferred to navigate by looking in the direction they were heading. We found a significant difference when participants were freely exploring the scene: they spent more time looking at the centre of the screen when a shallow depth of field was applied than with a deep depth of field. This was no longer the case during the search task. We demonstrate how these techniques might be effective for manipulating attention by keeping users’ eyes looking straight ahead while they freely explore a virtual environment.
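The central-region measure used above can be computed as in the following sketch (illustrative only; the function name and data format are assumptions, not taken from the study):

```python
def central_fixation_proportion(fixations, screen_w, screen_h):
    """Proportion of total fixation time spent in the central cell of a
    3x3 grid laid over the screen.

    fixations : iterable of (x, y, duration) tuples, with x/y in pixels.
    """
    # Boundaries of the central cell of the 3x3 grid.
    cx0, cx1 = screen_w / 3, 2 * screen_w / 3
    cy0, cy1 = screen_h / 3, 2 * screen_h / 3
    total = central = 0.0
    for x, y, dur in fixations:
        total += dur
        if cx0 <= x < cx1 and cy0 <= y < cy1:
            central += dur
    return central / total if total else 0.0
```

On a 1920×1080 display the central cell spans x in [640, 1280) and y in [360, 720); durations rather than fixation counts are summed, matching a proportion-of-time measure.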
People are increasingly consuming video media through the internet. This can lead to a mismatch between the auditory and visual streams due to internet connectivity. For instance, in a video of a news anchor reporting a story, there can be a time lag between the spoken words and the corresponding movements of the anchor’s mouth and lips (as well as body gestures). This asynchrony between the auditory and visual streams can also arise due to various physical, bio-physical and neural mechanisms, but people are often not aware of these differences. There is accumulating evidence that people adapt to auditory-visual asynchrony at different time scales and for different stimulus categories. However, previous studies often used very simple auditory-visual stimuli (e.g., flashing lights paired with brief tones) or short videos of a few seconds. In the current study, we investigated the temporal adaptation of continuous speech presented in longer videos. Speech is one of the strongest cases for auditory-visual integration, as demonstrated by multi-sensory illusions like the McGurk-MacDonald and Ventriloquist effects. To measure temporal adaptation to speech videos, we developed a continuous-judgment paradigm in which participants continuously judge over several tens of seconds whether an auditory-visual speech stimulus is synchronous or not. The stimuli consisted of 40 videos (duration: M = 63.3s, SD = 10.3s). For each video, we filmed a close-up (upper body) of one male and one female speaker reporting a news story transcribed from a real news clip (e.g., about the Brexit vote outcome or about Boris Johnson’s resignation). Each speaker reported 20 news stories. We then created seven versions of each video by shifting the relative stimulus onset asynchrony (SOA) between the auditory and visual streams from -240ms (auditory stream leading) to +240ms (visual stream leading) in 80ms steps. This included SOA = 0ms (i.e., the original synchronous video).
The first 5-10s of all videos were synchronous. For each participant in the continuous-judgment task, we randomly selected 10 videos at each SOA (70 in total). Participants continuously judged the synchrony of each video by pressing/releasing the spacebar throughout the duration of the video (response sampling rate = 33ms). The mean proportion of perceived synchrony across the duration of each video was calculated from participants’ continuous responses after the initial synchronous period. For the visual-leading videos (SOAs > 0ms), participants initially showed a drop in the proportion of perceived synchrony, but this proportion increased over time, suggesting that they were adapting to the asynchrony. The magnitude of temporal adaptation depended on the SOA, with the largest SOA producing the largest adaptation. Consistent with previous studies, our findings suggest that temporal adaptation occurs for long, continuous speech videos, but only when the visual stream leads the auditory stream.
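The dependent measure can be sketched as below. The 33ms response sampling rate is from the study; the 10s skip duration and the function name are assumptions for illustration (the videos had a 5-10s synchronous lead-in).

```python
import numpy as np

def mean_perceived_synchrony(response, sample_ms=33, skip_s=10.0):
    """Mean proportion of time a video was judged synchronous, from a
    binary keypress trace (1 = spacebar held = 'synchronous'), after
    discarding the initial synchronous lead-in.

    response : 1-D array sampled every `sample_ms` milliseconds.
    """
    response = np.asarray(response, dtype=float)
    skip = int(round(skip_s * 1000 / sample_ms))   # samples to discard
    judged = response[skip:]
    return float(judged.mean()) if judged.size else 0.0
```

Averaging this quantity within successive time windows, rather than over the whole trial, would expose the adaptation time course described above.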
In the evolving landscape of traffic management and autonomous driving technology, the analysis of traffic scenes from video data stands as a crucial challenge. Traditional approaches often rely on complex, high-dimensional image analysis, necessitating significant computational resources and sophisticated algorithms. Recognizing the limitations of these methods, our research introduces a novel, streamlined approach centered around a graph-based framework for understanding traffic dynamics.
Central to our methodology is the exploration of complex scene analysis through the lens of object-object interaction within traffic scenes. These interaction dynamics are captured through our specially designed graph structures, which are analyzed and interpreted using Graph Neural Networks (GNNs) as a foundational element. By employing GNNs, our framework delves into the intricate dynamics of traffic environments. We focus on the high-level interactions and behaviours within traffic scenes, distilling the essential patterns of movement and relationships among elements such as vehicles and pedestrians.
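The idea can be illustrated with a minimal sketch: nodes are traffic agents, edges connect agents within some interaction radius, and one mean-aggregation message-passing layer propagates features between neighbours. This is a generic GNN step in NumPy, not the paper's architecture; the radius and weight values are placeholders.

```python
import numpy as np

def build_scene_graph(positions, radius=20.0):
    """Adjacency matrix (with self-loops) connecting agents whose
    positions are closer than `radius` (e.g. metres)."""
    n = len(positions)
    d = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    return ((d < radius) | np.eye(n, dtype=bool)).astype(float)

def gnn_layer(adj, features, weight):
    """One mean-aggregation message-passing step: each agent averages
    its neighbours' features, applies a linear map, then a ReLU."""
    deg = adj.sum(axis=1, keepdims=True)   # neighbourhood sizes
    h = (adj @ features) / deg             # mean over each neighbourhood
    return np.maximum(h @ weight, 0.0)     # linear transform + ReLU
```

Stacking such layers lets information about one agent's motion influence the representation of nearby agents, which is the mechanism a graph-based scene classifier exploits.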
To validate the effectiveness of our framework, we conducted extensive testing using two prominent datasets: the METEOR Dataset and the INTERACTION Dataset. Our methodology demonstrated exceptional performance, achieving an accuracy of 62.03% on the METEOR Dataset and an impressive 98.50% on the INTERACTION Dataset. These results underscore the capability of our graph-based approach to accurately interpret and analyze the dynamics of traffic scenes.
Through this rigorous evaluation, our research not only showcases the significant advantages of incorporating graph neural networks for traffic scene analysis but also highlights the power of our novel approach in abstracting and understanding the complex patterns of movement and interactions within traffic environments. Our work sets a new benchmark in the field, offering a promising direction for future advancements in traffic management and autonomous vehicle technologies.
In the field of person re-identification (re-ID), accurately matching individuals across different camera views poses significant challenges due to variations in pose, illumination, viewpoint, and notably, scale. Traditional methods in re-ID have focused on robust feature descriptor generation and sophisticated metric learning, yet they often fall short in addressing scale variations effectively. In this work, we introduce a novel approach to scale-invariant person re-ID through the development of our scale-invariant residual networks coupled with an innovative batch adaptive triplet loss function for enhanced deep metric learning. The first network, termed Scale-Invariant Triplet Network (SI-TriNet), leverages pre-trained weights to form a deeper architecture, while the second, Scale-Invariant Siamese Resnet-32 (SISR-32), is a shallower structure trained from scratch. These networks are adept at handling scale variations, a common yet challenging aspect in re-ID tasks, by employing scale-invariant (SI) convolution techniques that ensure robust feature detection across multiple scales. This is complemented by our proposed batch adaptive triplet loss function that refines the metric learning process, dynamically prioritizing learning from harder positive samples to improve the model’s discriminatory capacity. Extensive evaluation on benchmark datasets Market-1501 and CUHK03 demonstrates the superiority of our proposed methods over existing state-of-the-art approaches. Notably, SI-TriNet and SISR-32 show significant improvements in both mean Average Precision (mAP) and rank-1 accuracy metrics, affirming the efficacy of our scale-invariant architectures and the novel loss function in addressing the complexities of person re-ID. 
This study not only advances the understanding of scale-invariant feature learning in deep networks but also sets a new benchmark in the person re-ID domain, promising more accurate and scalable solutions for real-world surveillance and security applications.
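The abstract does not spell out the "batch adaptive" weighting, so as a reference point the sketch below implements the standard batch-hard triplet loss that such a scheme typically builds on: for each anchor, the farthest same-identity sample and the closest different-identity sample in the batch define the triplet. This is an illustration in NumPy, not the authors' exact loss.

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss: for each anchor, use its hardest
    (farthest) positive and hardest (closest) negative in the batch.

    Note: a standard formulation for illustration, not the paper's
    'batch adaptive' variant, whose weighting scheme is not given here.
    """
    diff = embeddings[:, None] - embeddings[None, :]
    dist = np.linalg.norm(diff, axis=-1)          # pairwise distances
    same = labels[:, None] == labels[None, :]     # same-identity mask
    pos = np.where(same, dist, -np.inf)           # positives only
    neg = np.where(~same, dist, np.inf)           # negatives only
    hardest_pos = pos.max(axis=1)                 # farthest positive
    hardest_neg = neg.min(axis=1)                 # closest negative
    return float(np.maximum(hardest_pos - hardest_neg + margin, 0.0).mean())
```

When identities are well separated in embedding space the loss is zero; a dynamic prioritisation of harder positives, as described above, would reweight the per-anchor terms rather than averaging them uniformly.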
As humans move around, performing their daily tasks, they are able to recall where they have positioned objects in their environment, even if these objects are currently out of sight. In this paper, we aim to mimic this spatial cognition ability. We thus formulate the task of Out of Sight, Not Out of Mind – 3D tracking active objects using observations captured through an egocentric camera. We introduce Lift, Match and Keep (LMK), a method which lifts partial 2D observations to 3D world coordinates, matches them over time using visual appearance, 3D location and interactions to form object tracks, and keeps these object tracks even when they go out-of-view of the camera – hence keeping in mind what is out of sight.
We test LMK on 100 long videos from EPIC-KITCHENS. Our results demonstrate that spatial cognition is critical for correctly locating objects over short and long time scales. For example, for one long egocentric video, we estimate the 3D location of 50 active objects; of these, 60% are correctly positioned in 3D even 2 minutes after leaving the camera view.
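The matching step in this kind of tracker is typically cast as an assignment problem. The sketch below is a generic illustration, not LMK itself: it combines 3D world-coordinate distance with appearance (cosine) dissimilarity into a cost matrix and solves it with SciPy's Hungarian-algorithm routine; the weights and function name are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections(track_pos, track_feat, det_pos, det_feat,
                     w_pos=1.0, w_app=1.0):
    """One-to-one matching of new detections to existing 3D object
    tracks, combining Euclidean distance in world coordinates with
    appearance (cosine) dissimilarity.

    Returns a list of (track_index, detection_index) pairs.
    """
    d_pos = np.linalg.norm(track_pos[:, None] - det_pos[None, :], axis=-1)
    tf = track_feat / np.linalg.norm(track_feat, axis=1, keepdims=True)
    df = det_feat / np.linalg.norm(det_feat, axis=1, keepdims=True)
    d_app = 1.0 - tf @ df.T                    # 1 - cosine similarity
    cost = w_pos * d_pos + w_app * d_app       # combined matching cost
    rows, cols = linear_sum_assignment(cost)   # minimise total cost
    return list(zip(rows.tolist(), cols.tolist()))
```

Tracks left unmatched in a frame are simply kept at their last 3D position, which is what allows an object to remain "in mind" while out of view.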
The interpretation of social interactions between people is important in many daily situations. In this talk, we will present the results of 2 studies examining the visual perception of other people interacting. The first study used functional brain imaging to investigate the brain regions involved in the incidental visual processing of social interactions; that is, the processing of the body movements outside the observers’ focus of attention. The second study used a visual search paradigm to test whether people are better able to find interacting than non-interacting people in a crowd.
In the first study, we measured brain activation while participants (N = 33) were presented with point-light dyads portraying communicative interactions or individual actions. These types of stimuli allowed us to investigate the role of motion in processing social interactions by removing form cues. Participants in our study discriminated the brightness of two crosses also on the screen, thus excluding the body movements from the participant’s task-related focus of attention. To investigate brain regions that may process the spatial and temporal relationships between the point-light displays, we either reversed the facing direction of one agent or spatially scrambled the local motion of the points. Incidental processing of communicative interactions elicited activation in right anterior STS only when the two agents were facing each other. Controlling for differences in local motion by subtracting brain activation to scrambled versions of the point-light displays revealed significant activation in parietal cortex for communicative interactions, as well as left amygdala and brain stem/cerebellum. Our results complement previous studies and suggest that additional brain regions may be recruited to incidentally process the spatial and temporal contingencies that distinguish people acting together from people acting individually.
Our second study focussed on deliberate visual processing of communicative interactions in the observer’s focus of attention. Participants viewed arrays of the same point-light dyads used in our first study, but here they searched for an interacting dyad amongst a set of independently acting dyads, or for an independently acting dyad amongst a set of interacting dyads, by judging whether a target dyad was present or absent (targets were present on half the trials). In each of two experiments (N=32 and N=49), participants were faster and more accurate to detect the presence of interacting than independently acting target dyads. Moreover, visual search for interacting target dyads was more efficient than for independently acting target dyads, as indicated by shallower search slopes (the increase in response time with increasing number of distractors) for the former than for the latter. In the second experiment, we measured the eye movements of the participants using an eye tracker. The analyses of the eye tracking data are ongoing. Based on the results from our first study and on search performance, we expect that fixation duration on communicative-dyad targets will be shorter than on independent-dyad targets, because less attentional focus (as measured by fixation duration) is needed to process social interactions.
James Madison Carpenter (1888-1984) was a Harvard-trained scholar who recorded traditional singing in Britain and, to a lesser extent, in his native United States, in the period c.1928-40. His extensive collection includes 179 Dictaphone cylinders, totalling some 35 hours of examples (https://hdl.loc.gov/loc.afc/eadafc.af010002). Although he worked on the collection throughout his life, Carpenter’s hopes for publication were never realized and he eventually sold it to the Library of Congress. It has since been digitized and is now available online via the Vaughan Williams Memorial Library (https://www.vwml.org/archives-catalogue/JMC), facilitated through the work of an ongoing project by a team of UK- and US-based scholars towards surfacing the collection in a critical edition (https://www.abdn.ac.uk/elphinstone/carpenter/).
Carpenter was the first to consistently use a recording device in British folk song collecting. The cylinders thus have the potential to provide insights into traditional performance style and the comparative study of folk song melodies, as well as providing a relatively untapped source of repertoire for contemporary performers.
The quality of the sound recordings is, however, disappointingly poor, making Carpenter’s own transcriptions an essential complement. A self-trained musician, he found the process laborious, taking inspiration from the dictum that ‘a wrestler with sounds is a wrestler with shadows’.
This paper focuses on Carpenter’s approach to recording and evaluates his transcriptions, as well as outlining the issues involved in producing new ones. As we continue to wrestle with these historic sounds today, what is their potential for scholarship and performance, and what is the role of transcription in this context?
Music from Bosnia and Herzegovina was first recorded on cylinders during the Paris Exhibition in 1900. In 1907/1908, a series of recordings was made in Sarajevo and soon after commercially published as part of the Zonophone series by the Gramophone Company. Even though most of these recordings have been lost to history with the First World War, they provide us with an important glimpse into the state of the Bosnian music scene at the beginning of the 20th century. This paper offers an overview of the perception of these recordings in the local press, literature and the subsequent repertoire of musicians in Sarajevo and the wider region.
“A new ally has come to the cause in the form of an Edison Phonograph, whose function it is to preserve the Manx sounds as uttered by native speakers.” This is taken from the 1905 Annual Report of the Manx Language Society. The key figure here was Sophia Morrison, a Pan-Celtic activist and leader of the Celtic Revival in the Isle of Man, which started in the 1890s. The Island was also to see visits in 1907 and 1909 by Rudolf Trebitsch on behalf of the Phonogrammarchive of Vienna and Berlin. The MLS were not the first to use the phonograph in the Island, but they were the first to be systematic in its use. While the phonograph itself survives, the cylinder recordings made with it are in large part lost; nevertheless, there is useful detail as to how the programme was organised and implemented, which will be the subject of this presentation.
Since the early 2000s, I have made phonographic recordings on cylinders of traditional musics at music festivals, artistic residencies and public demonstrations of acoustic sound recording. Musicians from Spain, Senegal, Nigeria and South Korea, among other countries, have played and sung into my recording horn and had their recordings reproduced to them mechanically on an ‘Edison’ phonograph. Their responses may arguably be compared to those experienced by the earliest phonographic recording subjects over one hundred years earlier – the awe of hearing a sound recording reproduced for the first time is replaced in this instance by feelings of curiosity, delight and amazement at the process of acoustic sound recording.
In the manner of early ethnographic practices, the recordings were all made ‘in the field’, often in difficult-to-record locations such as the open air. Similarly, the recording apparatus used included a domestic Edison phonograph, such as those employed by ethnographers in the early 1900s, along with recorders and recording horns based on original examples. In this way, the entire recording set-up, or dispositif, may be seen as a media-archaeological reenactment of past ethnographic recording practices. The results were likewise documented and archived, as well as being disseminated through digital channels.
My presentation will focus on phonographic recordings on cylinders and documentation made at the annual Sinsal SON music festival on the island of San Simón in Galicia, where, since 2018, I have contributed to an ever-expanding archive of cylinder recordings that includes a variety of traditional musics from five continents.
In the summers of 1916 and 1917, the young folk music researcher Armas Otto Väisänen travelled through the remote villages of Border Karelia with a phonograph and a small hollow kantele from a museum. He aimed to collect the laments, kantele tunes and shepherd’s songs that were so scarce in the archives at the time. The old kantele tradition connected with the ancient runosong culture was disappearing, and by bringing an old instrument, Väisänen gave those who didn’t even have a kantele of their own the opportunity to play what they could remember. Supplemented by a short trip to Olonets Karelia in 1919 and later meetings with some individual tradition-bearers, Väisänen made detailed observations about the scales used by kantele players, the old playing technique and, above all, the special aesthetics of the music within the ancient tradition.
Copies of the wax cylinders are kept in the archives of the Finnish Literature Society in Helsinki. They present many challenges to modern listeners: it was not possible to get the instrument close enough to the phonograph horn, so the highest and lowest sounds were not captured. Moreover, the extremely poor quality of the surviving copies and their short examples, mostly of the dance tunes of the time, do not open up the hours-long, trance-like improvisation described by Väisänen and other scholars of the late 19th and early 20th centuries. But they do provide insights for musicians seeking new ways of performing this ancient music.