Genome Report
Microbiol. Biotechnol. Lett. 2023; 51(4): 551-554
https://doi.org/10.48022/mbl.2311.11004
Parthiban Subramanian1, Jeong-Seon Kim2, Jun Heo2, and Yiseul Kim2*
1National Agrobiodiversity Center, National Institute of Agricultural Sciences, Rural Development Administration, Wanju 55365, Republic of Korea
2Agricultural Microbiology Division, National Institute of Agricultural Sciences, Rural Development Administration, Wanju 55365, Republic of Korea
Correspondence to: Yiseul Kim, dew@korea.kr
We report the draft genome sequence of Sporobolomyces phaffii RJAF-17, a basidiomycetous yeast strain that produces the lipoamino acid surfactants N-palmitoyl leucine and N-palmitoleyl glutamine. Annotation and classification of the protein-coding genes provided basic information on the genome of strain RJAF-17, including prediction of abundant genes and detection of genes involved in the biosynthesis of lipoamino acids. Given the importance of lipoamino acids as promising alternatives to chemical surfactants, the genomic information of strain RJAF-17 can help us understand the role of these biomolecules in yeasts and explore possibilities of large-scale synthesis for industrial applications.
Keywords: Yeast, Sporobolomyces phaffii, biosurfactant, genome
In response to high demand for environmental and industrial applications of microorganisms, efforts have been made to explore microorganisms with biotechnological potential. Among the many useful products derived from microorganisms, biosurfactants are of particular importance because they are used worldwide for a variety of purposes. According to a recent survey, the global biosurfactant market is projected to grow at a compound annual growth rate of 5.5% from 2020 to 2026, while the chemical surfactant market is estimated to grow at around 5.3% from 2020 to 2027 (https://www.alliedmarketresearch.com). The adverse effects of climate change and rising overall pollution make it urgent to find environmentally friendly biosurfactants for everyday applications. With this in mind, wild flowers were collected in Gwangyang-si, Jeollanam-do, Republic of Korea for isolation of yeast strains, of which
The genome of strain RJAF-17 was sequenced using a combination of HiFi sequencing (Sequel II System) and Illumina HiSeq X-ten (Illumina, USA) platforms provided by Macrogen, Republic of Korea. High quality
All further analyses were carried out on the Galaxy Web platform (https://usegalaxy.org). Assembly completeness was assessed with BUSCO 4.1.4 [3] using the dothideomycetes_odb10 lineage dataset, and repetitive elements were assessed with RepeatMasker 4.1.5. For quality assessment of gene prediction, annotations of strain RJAF-17 were performed using AUGUSTUS and MAKER software with
Table 1. Statistics of gene prediction using the different programs.
Attribute | AUGUSTUS | MAKER |
---|---|---|
Contigs | 13 | 13 |
Number of genes predicted | 5,861 | 7,864 |
Number of transcripts predicted | 5,861 | 7,864 |
Complete BUSCOs | 1,468 | 1,658 |
Missing BUSCOs | 212 | 88 |
Number of selected queries by eggNOG-mapper | 4,362 (74.4%) | 6,015 (76.4%) |
Pfam hits* | 4,086 | 5,618 |
GO hits* | 2,656 | 3,582 |
EC hits* | 1,237 | 1,680 |
CAZy hits* | 80 | 117 |
*Number of predicted genes that contain at least one Pfam domain, one GO term, one enzyme, and one CAZy hit.
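The percentages in Table 1 can be reproduced from the raw counts. The following is a minimal sketch, not the authors' workflow: the dictionary layout, key names, and the function `annotation_rates` are illustrative assumptions, and only the counts themselves are taken from Table 1.

```python
# Counts copied from Table 1. The dictionary layout and key names are
# illustrative assumptions, not part of the original analysis.
TABLE_1 = {
    "AUGUSTUS": {"genes": 5861, "eggnog": 4362, "pfam": 4086,
                 "go": 2656, "ec": 1237, "cazy": 80},
    "MAKER": {"genes": 7864, "eggnog": 6015, "pfam": 5618,
              "go": 3582, "ec": 1680, "cazy": 117},
}


def annotation_rates(counts):
    """Return each annotation count as a percentage of predicted genes."""
    total = counts["genes"]
    return {
        key: round(100 * value / total, 1)
        for key, value in counts.items()
        if key != "genes"
    }


if __name__ == "__main__":
    for program, counts in TABLE_1.items():
        print(program, annotation_rates(counts))
```

With conventional rounding to one decimal place, the AUGUSTUS eggNOG-mapper fraction matches the 74.4% in Table 1, while the MAKER fraction comes out at 76.5% rather than the printed 76.4%, which is consistent with the table value having been truncated rather than rounded.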
The genome assembly was 19.7 Mb (19,710,391 bp) in size and consisted of 13 scaffolds. The largest scaffold was 3,769,252 bp long and the shortest was 463,633 bp long, with an N50 value of 1.9 Mb (1,875,694 bp). The GC content was estimated to be 57.5%. Completeness of the genome assembly was 92.2%. Analysis of repetitive elements revealed very few repeats in the genome (2.86%). The genes predicted using AUGUSTUS (5,861 genes) and MAKER (7,864 genes) were submitted to eggNOG-mapper for scanning (Table 1). Although there is currently no ANI threshold range for yeast species demarcation, calculation of ANI was performed between strain RJAF-17 and three available genomes of the genus
Functional annotation using outputs from both AUGUSTUS and MAKER elucidated functional traits of the coding sequences (Fig. 1). In terms of COG categories, strain RJAF-17 contained many genes related to “post-translational modification, protein turnover, chaperones” followed by “translation, ribosomal structure and biogenesis”. Analysis using fungal antiSMASH identified non-ribosomal peptide synthetase domains (NRPS, contigs 1 and 3), terpene synthesis domains (contigs 2 and 4), and a betalactone synthesis domain (contig 2) (Table 2). As mentioned earlier, strain RJAF-17 was observed to produce the surfactant molecules N-palmitoyl leucine and N-palmitoleyl glutamine, which are categorized as lipoamino acids (patent application number 10-2023-0149087). The surfactant properties of these lipoamino acids are well established [7, 8]. Formed by the association of a polar amino acid and a non-polar long-chain compound, these molecules have high surface activity, which gives them their surfactant characteristics. In bacteria, this acylation reaction is reported to be carried out by two enzymes, an N-acetyltransferase and an O-acetyltransferase, of which the former catalyzes the initial conjugation of the amino acid to a beta-hydroxy fatty acid, followed by conjugation of a second fatty acid to the lysolipid by the latter [9]. In this study, we found five genes for N-acetyltransferase in the annotated genome of strain RJAF-17 but none for O-acetyltransferases. As
Table 2. Secondary metabolite gene clusters determined by the fungal version of antiSMASH.
Contig | Region | Type | From | To | Most similar known clusters | Similarity |
---|---|---|---|---|---|---|
Contig 1 | 1.1 | NRPS | 1,464,762 | 1,510,798 | Nonribosomal peptide synthase of yeast | 54.9% |
Contig 2 | 2.1 | Terpene | 56,421 | 78,302 | Hypothetical protein from | 70.3% |
Contig 2 | 2.2 | Betalactone | 1,585,039 | 1,617,996 | 6-Coumarate-CoA ligase of | 68.3% |
Contig 3 | 3.1 | NRPS-like | 438,470 | 483,419 | L-Aminoadipate-semialdehyde dehydrogenase of | 81.3% |
Contig 4 | 4.1 | Terpene | 75,068 | 102,257 | Squalene cyclase (SQCY) domain gene of | 66.0% |
Contig 4 | 4.2 | Terpene | 572,343 | 593,838 | Lycopene cyclase/phytoene synthase of | 71.3% |
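The From and To coordinates in Table 2 give the span of each predicted region. As a minimal sketch, the region lengths and the total length per cluster type can be derived as follows. The coordinates are copied from Table 2; the names `REGIONS`, `region_length`, and `total_length_by_type` are illustrative, and treating the coordinates as 1-based and inclusive is an assumption rather than something stated in the report.

```python
from collections import defaultdict

# (contig, region, cluster type, from, to) copied from Table 2.
REGIONS = [
    ("Contig 1", "1.1", "NRPS", 1_464_762, 1_510_798),
    ("Contig 2", "2.1", "Terpene", 56_421, 78_302),
    ("Contig 2", "2.2", "Betalactone", 1_585_039, 1_617_996),
    ("Contig 3", "3.1", "NRPS-like", 438_470, 483_419),
    ("Contig 4", "4.1", "Terpene", 75_068, 102_257),
    ("Contig 4", "4.2", "Terpene", 572_343, 593_838),
]


def region_length(start, end):
    """Region length in bp, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


def total_length_by_type(regions):
    """Sum region lengths (bp) for each cluster type."""
    totals = defaultdict(int)
    for _contig, _region, cluster_type, start, end in regions:
        totals[cluster_type] += region_length(start, end)
    return dict(totals)


if __name__ == "__main__":
    for contig, region, cluster_type, start, end in REGIONS:
        print(contig, region, cluster_type, region_length(start, end), "bp")
    print(total_length_by_type(REGIONS))
```

If the coordinates turn out to be 0-based or half-open, each length shifts by one base pair, which does not change the overall picture of regions in the range of roughly 21 to 46 kb.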
This study (Project No. 015675) was carried out with the support of the National Institute of Agricultural Sciences, Rural Development Administration, Republic of Korea.
The authors have no financial conflicts of interest to declare.