BCO-DMO ERDDAP
Accessing BCO-DMO data
Brought to you by BCO-DMO    


Title: Transcriptome statistics from samples obtained on LMG1411 collected on the Gould (LMG1411) in the Western Antarctica Peninsula in 2014. (Polar Transcriptomes project)
Institution: BCO-DMO
Dataset ID: bcodmo_dataset_665311
Accessible: public

The Dataset's Variables and Attributes

Row Type Variable Name Attribute Name Data Type Value
attribute NC_GLOBAL access_formats String .htmlTable,.csv,.json,.mat,.nc,.tsv
attribute NC_GLOBAL acquisition_description String Nine species of diatoms were isolated from the Western Antarctic Peninsula
along the PalmerLTER sampling grid in 2013 and 2014. Isolations were performed
using an Olympus CKX41 inverted microscope by single cell isolation with a
micropipette (Anderson 2005). Diatom species were identified by morphological
characterization and 18S rRNA gene (rDNA) sequencing. DNA was extracted with
the DNeasy Plant Mini Kit according to the manufacturer’s protocols
(Qiagen). Amplification of the nuclear 18S rDNA region was achieved with
standard PCR protocols using eukaryotic-specific, universal 18S forward and
reverse primers. Primer sequences were obtained from Medlin et al. (1982). The
length of the region amplified is approximately 1800 base pairs (bp).
Pseudo-nitzschia species are often difficult to identify by their
18S rDNA sequence; therefore, additional support for the taxonomic
identification of P. subcurvata was provided through sequencing
of the 18S-ITS1-5.8S regions. Amplification of this region was performed with
the 18SF-euk and 5.8SR_euk primers of Hubbard et al. (2008). PCR products were
purified using either QIAquick PCR Purification Kit (Qiagen) or ExoSAP-IT
(Affymetrix) and sequenced by Sanger DNA sequencing (Genewiz). Sequences were
edited using Geneious Pro software
(http://www.geneious.com, Kearse et al.,
2012) and BLASTn sequence homology searches were performed against the NCBI
nucleotide non-redundant (nr) database to determine species with a cutoff
identity of 98%.

Diatom phylogenetic analysis was performed with Geneious Pro and included 71
additional diatom 18S rDNA sequences from publicly available genomes and
transcriptomes, including those in the MMETSP database. Diatom sequences were
trimmed to the same length and aligned with MUSCLE (Edgar 2004). A
phylogenetic tree was created in Mega with the Maximum-likelihood method of
tree reconstruction, the Jukes-Cantor genetic distance model (Jukes and Cantor
1969), and 100 bootstrap replicates.

Illumina TruSeq adapters and poly-A tails were trimmed from raw reads using
the Fastx_toolkit clipper function. Fastq_quality_filter was used to remove
poor quality sequences, such that remaining sequences had a minimum quality
score of 20 with a minimum of 80% of bases within a read meeting
this quality score requirement. Any remaining raw sequences less than 50 base
pairs in length were also removed. Merged files were assembled de
novo using Trinity (Grabherr et al. 2011). The resulting assembly was
filtered to remove contigs less than 200 bp in length. Trinity-assembled
contigs which exhibited sequence overlap were grouped into isogroups which
were then used for sequence homology searches (BLASTx E-value ≤ 10^-4)
against the Kyoto Encyclopedia of Genes and Genomes (KEGG) databases (Kanehisa
2006).

BUSCO (Benchmarking Universal Single-Copy Orthologs) was used to assess the
completeness of genomes and transcriptomes based on sets of single-copy
orthologous groups derived from OrthoDB that are highly conserved
within multiple lineages (Felipe et al. 2015). Completed, duplicated and
fragmented orthologs were determined by meeting an ‘expected score’
and having aligned sequences within two standard deviations of the BUSCO
gene’s length. A second metric of completeness was obtained by
evaluating conserved pathways, such as the ribosome and spliceosome, using the
single-directional best-hit method in the KEGG Automatic Annotation
Server (KAAS) (Moriya et al. 2007). Finally, contiguity was
calculated at the 0.75 level according to Martin and Wang (2011) with
custom scripts.

For each transcriptome, unassembled sequence reads were aligned to the final
Trinity assembly using Bowtie 2 (Langmead 2012). Mapped reads were normalized
by the Reads per Kilobase per Million reads method (RPKM) (Mortazavi et al.
2008).

Gene biogeographical distributions - 20 genes of interest were selected
in the study to investigate the molecular basis of iron and light limitation
in polar diatoms. Reference sequences for each of these genes were obtained
from the F. cylindrus and P. tricornutum JGI genome portals
and the T. pseudonana and T. oceanica NCBI and
GenBank repositories. Reference sequences were identified in the
transcriptomes by translated nucleotide homology searches (tBLASTn) with an
e-value cutoff of <10^-5. A reciprocal tBLASTn homology search was performed
for each transcriptome against the KEGG GENES database, using the
single-directional best-hit method in the KAAS online tool to ensure
consistent gene annotations (Moriya et al. 2007).

Subsequently, reference sequences were identified in the MMETSP protein
database by BLASTp (e-value <10^-5) homology searches among the diatom
transcriptomes. The transcriptomes and their associated latitude and longitude
were obtained from iMicrobe Data Commons (Project Code CAM_P_0001000) and the
National Center for Marine Algae and Microbiota (NCMA). Custom Matlab scripts
allowed global biogeographical distribution of key genes of interest to be
mapped.
attribute NC_GLOBAL awards_0_award_nid String 653228
attribute NC_GLOBAL awards_0_award_number String PLR-1341479
attribute NC_GLOBAL awards_0_data_url String http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1341479
attribute NC_GLOBAL awards_0_funder_name String NSF Division of Ocean Sciences
attribute NC_GLOBAL awards_0_funding_acronym String NSF OCE
attribute NC_GLOBAL awards_0_funding_source_nid String 355
attribute NC_GLOBAL awards_0_program_manager String Dr Chris H. Fritsen
attribute NC_GLOBAL awards_0_program_manager_nid String 50502
attribute NC_GLOBAL cdm_data_type String Other
attribute NC_GLOBAL comment String Transcriptome Statistics
Adrian Marchetti, PI
Version 11 October 2016
attribute NC_GLOBAL Conventions String COARDS, CF-1.6, ACDD-1.3
attribute NC_GLOBAL creator_email String info at bco-dmo.org
attribute NC_GLOBAL creator_name String BCO-DMO
attribute NC_GLOBAL creator_type String institution
attribute NC_GLOBAL creator_url String https://www.bco-dmo.org/
attribute NC_GLOBAL data_source String extract_data_as_tsv version 2.3 19 Dec 2019
attribute NC_GLOBAL date_created String 2016-11-18T23:55:55Z
attribute NC_GLOBAL date_modified String 2019-04-18T13:45:06Z
attribute NC_GLOBAL defaultDataQuery String &time<now
attribute NC_GLOBAL doi String 10.1575/1912/bco-dmo.665311.1
attribute NC_GLOBAL infoUrl String https://www.bco-dmo.org/dataset/665311
attribute NC_GLOBAL institution String BCO-DMO
attribute NC_GLOBAL instruments_0_acronym String Inverted Microscope
attribute NC_GLOBAL instruments_0_dataset_instrument_description String Used to perform isolations
attribute NC_GLOBAL instruments_0_dataset_instrument_nid String 665318
attribute NC_GLOBAL instruments_0_description String An inverted microscope is a microscope with its light source and condenser on the top, above the stage pointing down, while the objectives and turret are below the stage pointing up. It was invented in 1850 by J. Lawrence Smith, a faculty member of Tulane University (then named the Medical College of Louisiana).

Inverted microscopes are useful for observing living cells or organisms at the bottom of a large container (e.g. a tissue culture flask) under more natural conditions than on a glass slide, as is the case with a conventional microscope. Inverted microscopes are also used in micromanipulation applications where space above the specimen is required for manipulator mechanisms and the microtools they hold, and in metallurgical applications where polished samples can be placed on top of the stage and viewed from underneath using reflecting objectives.

The stage on an inverted microscope is usually fixed, and focus is adjusted by moving the objective lens along a vertical axis to bring it closer to or further from the specimen. The focus mechanism typically has a dual concentric knob for coarse and fine adjustment. Depending on the size of the microscope, four to six objective lenses of different magnifications may be fitted to a rotating turret known as a nosepiece. These microscopes may also be fitted with accessories for fitting still and video cameras, fluorescence illumination, confocal scanning and many other applications.
attribute NC_GLOBAL instruments_0_instrument_external_identifier String https://vocab.nerc.ac.uk/collection/L05/current/LAB05/
attribute NC_GLOBAL instruments_0_instrument_name String Inverted Microscope
attribute NC_GLOBAL instruments_0_instrument_nid String 675
attribute NC_GLOBAL instruments_0_supplied_name String Olympus CKX41
attribute NC_GLOBAL instruments_1_acronym String Bioanalyzer
attribute NC_GLOBAL instruments_1_dataset_instrument_description String Used to determine RNA integrity
attribute NC_GLOBAL instruments_1_dataset_instrument_nid String 665321
attribute NC_GLOBAL instruments_1_description String A Bioanalyzer is a laboratory instrument that provides the sizing and quantification of DNA, RNA, and proteins. One example is the Agilent Bioanalyzer 2100.
attribute NC_GLOBAL instruments_1_instrument_name String Bioanalyzer
attribute NC_GLOBAL instruments_1_instrument_nid String 626182
attribute NC_GLOBAL instruments_1_supplied_name String Agilent Bioanalyzer 2100
attribute NC_GLOBAL keywords String bco, bco-dmo, biological, busco, BUSCO_pcnt, chemical, contig, contigs, contigs_num, contiguity, data, dataset, dmo, erddap, isogroups, isogroups_num, kegg, length, management, max, max_contig_length, mean, mean_contig_length, min, min_contig_length, n50, num, oceanography, office, pcnt, preliminary, raw, raw_sequence_reads, reads, ribosome, ribosome_pcnt, sequence, size, species, spliceosome, spliceosome_pcnt, transcriptome, transcriptome_size
attribute NC_GLOBAL license String https://www.bco-dmo.org/dataset/665311/license
attribute NC_GLOBAL metadata_source String https://www.bco-dmo.org/api/dataset/665311
attribute NC_GLOBAL param_mapping String {'665311': {}}
attribute NC_GLOBAL parameter_source String https://www.bco-dmo.org/mapserver/dataset/665311/parameters
attribute NC_GLOBAL people_0_affiliation String University of North Carolina at Chapel Hill
attribute NC_GLOBAL people_0_affiliation_acronym String UNC-Chapel Hill
attribute NC_GLOBAL people_0_person_name String Adrian Marchetti
attribute NC_GLOBAL people_0_person_nid String 527120
attribute NC_GLOBAL people_0_role String Principal Investigator
attribute NC_GLOBAL people_0_role_type String originator
attribute NC_GLOBAL people_1_affiliation String University of North Carolina at Chapel Hill
attribute NC_GLOBAL people_1_affiliation_acronym String UNC-Chapel Hill
attribute NC_GLOBAL people_1_person_name String Adrian Marchetti
attribute NC_GLOBAL people_1_person_nid String 527120
attribute NC_GLOBAL people_1_role String Contact
attribute NC_GLOBAL people_1_role_type String related
attribute NC_GLOBAL people_2_affiliation String Woods Hole Oceanographic Institution
attribute NC_GLOBAL people_2_affiliation_acronym String WHOI BCO-DMO
attribute NC_GLOBAL people_2_person_name String Hannah Ake
attribute NC_GLOBAL people_2_person_nid String 650173
attribute NC_GLOBAL people_2_role String BCO-DMO Data Manager
attribute NC_GLOBAL people_2_role_type String related
attribute NC_GLOBAL project String Polar_Transcriptomes
attribute NC_GLOBAL projects_0_acronym String Polar_Transcriptomes
attribute NC_GLOBAL projects_0_description String The Southern Ocean surrounding Antarctica is changing rapidly in response to Earth's warming climate. These changes will undoubtedly influence communities of primary producers (the organisms at the base of the food chain, particularly plant-like organisms using sunlight for energy) by altering conditions that influence their growth and composition. Because primary producers such as phytoplankton play an important role in global biogeochemical cycling, it is essential to understand how they will respond to changes in their environment. The growth of phytoplankton in certain regions of the Southern Ocean is constrained by steep gradients in chemical and physical properties that vary in both space and time. Light and iron have been identified as key variables influencing phytoplankton abundance and distribution within Antarctic waters. Microscopic algae known as diatoms are dominant members of the phytoplankton and sea ice communities, accounting for significant proportions of primary production. The overall objective of this project is to identify the molecular bases for the physiological responses of polar diatoms to varying light and iron conditions. The project should provide a means of evaluating the extent these factors regulate diatom growth and influence net community productivity in Antarctic waters. The project will also further the NSF goals of making scientific discoveries available to the general public and of training new generations of scientists. It will facilitate the teaching and learning of polar-related topics by translating the research objectives into readily accessible educational materials for middle-school students. This project will also provide funding to enable a graduate student and several undergraduate students to be trained in the techniques and perspectives of modern biology.
Although numerous studies have investigated how polar diatoms are affected by varying light and iron, the cellular mechanisms leading to their distinct physiological responses remain unknown. Using comparative transcriptomics, the expression patterns of key genes and metabolic pathways in several ecologically important polar diatoms recently isolated from Antarctic waters and grown under varying iron and irradiance conditions will be examined. In addition, molecular indicators for iron and light limitation will be developed within these polar diatoms through the identification of iron- and light-responsive genes -- the expression patterns of which can be used to determine their physiological status. Upon verification in laboratory cultures, these indicators will be utilized by way of metatranscriptomic sequencing to examine iron and light limitation in natural diatom assemblages collected along environmental gradients in Western Antarctic Peninsula waters. In order to fully understand the role phytoplankton play in Southern Ocean biogeochemical cycles, dependable methods that provide a means of elucidating the physiological status of phytoplankton at any given time and location are essential.
attribute NC_GLOBAL projects_0_end_date String 2017-07
attribute NC_GLOBAL projects_0_geolocation String Antarctica
attribute NC_GLOBAL projects_0_name String Iron and Light Limitation in Ecologically Important Polar Diatoms: Comparative Transcriptomics and Development of Molecular Indicators
attribute NC_GLOBAL projects_0_project_nid String 653229
attribute NC_GLOBAL projects_0_project_website String http://www.nsf.gov/awardsearch/showAward?AWD_ID=1341479
attribute NC_GLOBAL projects_0_start_date String 2014-08
attribute NC_GLOBAL publisher_name String Biological and Chemical Oceanographic Data Management Office (BCO-DMO)
attribute NC_GLOBAL publisher_type String institution
attribute NC_GLOBAL sourceUrl String (local files)
attribute NC_GLOBAL standard_name_vocabulary String CF Standard Name Table v55
attribute NC_GLOBAL summary String Transcriptome statistics from samples obtained on LMG1411 collected on the Gould (LMG1411) in the Western Antarctica Peninsula in 2014. (Polar Transcriptomes project)
attribute NC_GLOBAL title String Transcriptome statistics from samples obtained on LMG1411 collected on the Gould (LMG1411) in the Western Antarctica Peninsula in 2014. (Polar Transcriptomes project)
attribute NC_GLOBAL version String 1
attribute NC_GLOBAL xml_source String osprey2erddap.update_xml() v1.3
variable species   String  
attribute species bcodmo_name String species
attribute species description String Species analyzed
attribute species long_name String Species
attribute species units String unitless
variable raw_sequence_reads   int  
attribute raw_sequence_reads _FillValue int 2147483647
attribute raw_sequence_reads actual_range int 681141, 2071629
attribute raw_sequence_reads bcodmo_name String unknown
attribute raw_sequence_reads description String Total number of raw sequence reads per species
attribute raw_sequence_reads long_name String Raw Sequence Reads
attribute raw_sequence_reads units String count
variable contigs_num   int  
attribute contigs_num _FillValue int 2147483647
attribute contigs_num actual_range int 6029, 44909
attribute contigs_num bcodmo_name String unknown
attribute contigs_num description String Number of contigs per species.
attribute contigs_num long_name String Contigs Num
attribute contigs_num units String count
variable isogroups_num   int  
attribute isogroups_num _FillValue int 2147483647
attribute isogroups_num actual_range int 4784, 42346
attribute isogroups_num bcodmo_name String unknown
attribute isogroups_num description String Number of isogroups per species.
attribute isogroups_num long_name String Isogroups Num
attribute isogroups_num units String count
variable transcriptome_size   float  
attribute transcriptome_size _FillValue float NaN
attribute transcriptome_size actual_range float 2.2, 22.9
attribute transcriptome_size bcodmo_name String unknown
attribute transcriptome_size description String Transcriptome size by species.
attribute transcriptome_size long_name String Transcriptome Size
attribute transcriptome_size units String Megabase
variable mean_contig_length   short  
attribute mean_contig_length _FillValue short 32767
attribute mean_contig_length actual_range short 338, 687
attribute mean_contig_length bcodmo_name String length
attribute mean_contig_length description String Average contig length by species.
attribute mean_contig_length long_name String Mean Contig Length
attribute mean_contig_length units String base pair
variable max_contig_length   short  
attribute max_contig_length _FillValue short 32767
attribute max_contig_length actual_range short 5810, 8191
attribute max_contig_length bcodmo_name String length
attribute max_contig_length description String Maximum contig length by species.
attribute max_contig_length long_name String Max Contig Length
attribute max_contig_length units String base pair
variable min_contig_length   short  
attribute min_contig_length _FillValue short 32767
attribute min_contig_length actual_range short 200, 224
attribute min_contig_length bcodmo_name String length
attribute min_contig_length description String Minimum contig length by species.
attribute min_contig_length long_name String Min Contig Length
attribute min_contig_length units String base pair
variable N50   short  
attribute N50 _FillValue short 32767
attribute N50 actual_range short 315, 935
attribute N50 bcodmo_name String length
attribute N50 description String N50 value; N50 length is defined as the shortest sequence length at 50% of the genome
attribute N50 long_name String N50
attribute N50 units String unitless
variable contiguity   float  
attribute contiguity _FillValue float NaN
attribute contiguity actual_range float 0.07, 0.25
attribute contiguity bcodmo_name String unknown
attribute contiguity description String Contiguity threshold 0.75
attribute contiguity long_name String Contiguity
attribute contiguity units String unitless
variable BUSCO_pcnt   byte  
attribute BUSCO_pcnt _FillValue byte 127
attribute BUSCO_pcnt actual_range byte 7, 56
attribute BUSCO_pcnt bcodmo_name String unknown
attribute BUSCO_pcnt description String Completeness of genome based on 429 core eukaryotic genes
attribute BUSCO_pcnt long_name String BUSCO Pcnt
attribute BUSCO_pcnt units String percent
variable spliceosome_pcnt   byte  
attribute spliceosome_pcnt _FillValue byte 127
attribute spliceosome_pcnt actual_range byte 21, 87
attribute spliceosome_pcnt bcodmo_name String unknown
attribute spliceosome_pcnt description String Spliceosome KAAS pathway completeness
attribute spliceosome_pcnt long_name String Spliceosome Pcnt
attribute spliceosome_pcnt units String percent
variable ribosome_pcnt   byte  
attribute ribosome_pcnt _FillValue byte 127
attribute ribosome_pcnt actual_range byte 60, 81
attribute ribosome_pcnt bcodmo_name String unknown
attribute ribosome_pcnt description String Ribosome KAAS pathway completeness
attribute ribosome_pcnt long_name String Ribosome Pcnt
attribute ribosome_pcnt units String percent
variable KEGG   String  
attribute KEGG bcodmo_name String unknown
attribute KEGG description String KEGG value; Functionally annotated contigs
attribute KEGG long_name String KEGG
attribute KEGG units String count
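
The _FillValue attributes in the table above mark missing data with sentinel values (2147483647 for int, 32767 for short, 127 for byte, NaN for float). A minimal sketch of how a client might replace those sentinels with None after downloading a row is shown below. The function name mask_fill_values and the example row are hypothetical and serve only as illustration; the sentinel values themselves are copied from the table.

```python
import math

# Sentinel values copied from the _FillValue attributes listed in the table.
FILL_VALUES = {
    "raw_sequence_reads": 2147483647,
    "contigs_num": 2147483647,
    "isogroups_num": 2147483647,
    "mean_contig_length": 32767,
    "max_contig_length": 32767,
    "min_contig_length": 32767,
    "N50": 32767,
    "BUSCO_pcnt": 127,
    "spliceosome_pcnt": 127,
    "ribosome_pcnt": 127,
}
# The two float variables use NaN as their fill value.
FLOAT_VARIABLES = {"transcriptome_size", "contiguity"}


def mask_fill_values(row):
    """Return a copy of `row` with fill values replaced by None."""
    cleaned = {}
    for name, value in row.items():
        if name in FLOAT_VARIABLES and isinstance(value, float) and math.isnan(value):
            cleaned[name] = None
        elif name in FILL_VALUES and value == FILL_VALUES[name]:
            cleaned[name] = None
        else:
            cleaned[name] = value
    return cleaned


# Hypothetical row for illustration only; it is not a record from the dataset.
example_row = {
    "species": "hypothetical diatom",
    "raw_sequence_reads": 681141,
    "N50": 32767,
    "contiguity": float("nan"),
    "BUSCO_pcnt": 56,
}
print(mask_fill_values(example_row))
```

String variables such as species and KEGG have no _FillValue attribute in the table, so the sketch passes them through unchanged.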

The information in the table above is also available in other file formats (.csv, .htmlTable, .itx, .json, .jsonlCSV1, .jsonlCSV, .jsonlKVP, .mat, .nc, .nccsv, .tsv, .xhtml) via a RESTful web service.
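
As a rough illustration of that web service, the sketch below builds request URLs following ERDDAP's usual patterns (info/{datasetID}/index{fileType} for metadata, tabledap/{datasetID}{fileType}?{query} for data). The base URL is an assumption about where the BCO-DMO server lives, and the use of tabledap assumes this dataset is served as tabular data. The helper names info_url and tabledap_url are hypothetical. The code only constructs URLs and makes no network request.

```python
from urllib.parse import quote

# Assumed base URL of the BCO-DMO ERDDAP server; adjust if it differs.
BASE_URL = "https://erddap.bco-dmo.org/erddap"
DATASET_ID = "bcodmo_dataset_665311"


def info_url(file_type, base_url=BASE_URL, dataset_id=DATASET_ID):
    """URL for the metadata table in a given format, e.g. '.csv' or '.json'."""
    return f"{base_url}/info/{dataset_id}/index{file_type}"


def tabledap_url(file_type, variables=(), constraints=(),
                 base_url=BASE_URL, dataset_id=DATASET_ID):
    """URL for the data itself, with optional variable list and constraints."""
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode characters such as '<' and '>' but keep '=' readable.
        query += "&" + quote(constraint, safe="=")
    url = f"{base_url}/tabledap/{dataset_id}{file_type}"
    return f"{url}?{query}" if query else url


print(info_url(".csv"))
print(tabledap_url(".csv", ["species", "N50", "BUSCO_pcnt"], ["BUSCO_pcnt>=50"]))
```

The file types passed in should be among those the page lists, for example the access_formats values (.htmlTable, .csv, .json, .mat, .nc, .tsv) for data requests.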


 
ERDDAP, Version 2.02