bcodmo_dataset_812936

Name: [IODP360 - FPKM values] - Supplementary Table 4A: Metatranscriptome data summary for cellular activities presented and statistics on sequencing and removal of potential contaminant sequences, FPKM values (Collaborative Research: Delineating The Microbial Diversity and Cross-domain Interactions in The Uncharted Subseafloor Lower Crust Using Meta-omics and Culturing Approaches)
Creator: BCO-DMO
License: https://www.bco-dmo.org/dataset/812936/license

The Dataset's Variables and Attributes

Row Type	Variable Name	Attribute Name	Data Type	Value
attribute	NC_GLOBAL	access_formats	String	.htmlTable,.csv,.json,.mat,.nc,.tsv
attribute	NC_GLOBAL	acquisition_description	String	Frozen rock material was crushed as above, and then ground quickly into a fine powder using a precooled sterilized mortar and pestle, and then RNA extraction started immediately. The jaw crusher was cleaned and rinsed with 70% ethanol and RNaseZap™ RNase Decontamination Solution (Invitrogen, USA) between samples. About 40 g of material was extracted for each sample using the RNeasy PowerSoil Total RNA Isolation Kit (Qiagen, USA) according to the manufacturer’s protocol with the following modifications. Each sample was evenly divided into 8 Bead Tubes (Qiagen, USA) and then 2.5 mL of Bead Solution were added into the Bead Tube followed by 0.25 mL of Solution SR1 and 0.8 mL of Solution SR2. Bead Tubes were frozen in liquid nitrogen and then thawed at 65°C in a water bath three times. RNA was purified using the MEGAclear Transcription Clean-up Kit (Ambion, USA) and concentrated with an overnight isopropanol precipitation at 4 °C. Trace amounts of contaminating DNA were removed from the RNA extracts using TURBO DNA free™ (Invitrogen, USA) as directed by the manufacturer. To ensure DNA was removed thoroughly, each RNA extract was treated twice with TURBO DNase (Invitrogen, USA). A nested PCR reaction (2 x 35 cycles) using bacterial primers was used to confirm the absence of DNA in our RNA solutions. RNA was converted to cDNA using the Ovation® RNA-Seq System V2 kit (NuGEN, USA) according to the manufacturer’s protocol to preferentially prime non-rRNA sequences. The cDNA was purified with the MinElute Reaction Cleanup Kit (Qiagen, USA) and eluted into 20 μL elution buffer. Extracts were quantified using a Qubit Fluorometer (Life Technologies, USA) and cDNAs were stored at -80 °C until sequencing using 150 bp paired-end Illumina NextSeq 550.
To control for potential contaminants introduced during drilling, sample handling, and laboratory kit reagents, we sequenced a number of control samples as above. Two samples controlled for potential nucleic acid contamination: a “method” control to monitor possible contamination from our laboratory extractions, which included ~40 g sterilized glass beads processed through the entire protocol in place of rock, and a “kit” control to account for any signal coming from trace contaminants in kit reagents, which received no addition. In addition, 3 more controls were extracted: a sample of the drilling mud (Sepiolite), and two drilling seawater samples collected during the first and third weeks of drilling. cDNAs obtained from these controls were sequenced together with the rock samples and co-assembled. Trimmomatic (v. 0.32) was used to trim adapter sequences (leading=20, trailing=20, sliding window=04:24, minlen=50). Paired reads were further quality checked and trimmed using FastQC (v. 0.11.7) and FASTX-toolkit (v. 0.014). Downstream analyses utilized paired reads. After co-assembling reads with Trinity (v. 2.4.0) from all controls (min length 150 bp), Bowtie2 (v. 2.3.4.1, 50) was used (with the parameter ‘un-conc’) to align all sample reads to this co-assembly. Reads that mapped to our control co-assembly allowing 1 mismatch were removed from further analysis (23.5-68.5% of sequences remained in sample data sets, see Supplementary Table 4). Trinity (v. 2.4.0) was used for de novo assembly of the remaining reads in sample data sets (min. length 150 bp). Bowtie aligner was used to align reads to assembled contigs, RSEM was used to estimate the expression level of these reads, and TMM was used to perform cross sample normalization and to generate a TMM-normalized expression matrix. Within the Trinotate suite, TransDecoder (v. 3.0.1) was used to identify coding regions within contigs and functional and taxonomic annotation was made by BLASTx and BLASTp against UniProt, Swissprot (release 2018_02) and RefSeq non-redundant protein sequence (nr) databases (e-value threshold of 1e-5). BLASTp was used to look for sequence homologies with the same e-values. HMMER (v. 3.1b2) was used to identify conserved domains by searching against the Pfam (v31.0) database. SignalP (v. 4.1) and TMHMM (2.0c) were used to predict signal peptides and transmembrane domains. RNAMMER (v.1.2) was used to identify rRNA homologies of archaea, bacteria and eukaryotes. Because the Swissprot database does not have extensive representation of protein sequences from environmental samples, particularly deep-sea and deep biosphere samples, annotations of contigs utilized for analyses of selected processes were manually cross checked by BLASTx against GenBank nr database. Aside from removing any reads that mapped well to our control co-assembly (1 mismatch), as an extra precaution, any sequence that exhibited ≥ 95% sequence identity over ≥ 80% of the sequence length to suspected contaminants (e.g., human pathogens, plants, or taxa known to be common molecular kit reagent contaminants, and not described from the marine environment) as in Salter et al. and Glassing et al. was removed. This conservative approach potentially removed environmentally relevant data that were annotated to suspected contaminants due to poor taxonomic representation from environmental taxa in public databases; however, it affords the highest possible confidence about any transcripts discussed. Additional functional annotations of contigs were obtained by BLAST against the KEGG, COG, SEED, and MetaCyc databases using MetaPathways (v. 2.0) to gain insights into particular cellular processes, and to provide overviews of metabolic functions across samples based on comparisons of FPKM-normalized data.
All annotations were integrated into a SQLite database for further analysis.
attribute	NC_GLOBAL	awards_0_award_nid	String	709555
attribute	NC_GLOBAL	awards_0_award_number	String	OCE-1658031
attribute	NC_GLOBAL	awards_0_data_url	String	http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1658031
attribute	NC_GLOBAL	awards_0_funder_name	String	NSF Division of Ocean Sciences
attribute	NC_GLOBAL	awards_0_funding_acronym	String	NSF OCE
attribute	NC_GLOBAL	awards_0_funding_source_nid	String	355
attribute	NC_GLOBAL	awards_0_program_manager	String	David L. Garrison
attribute	NC_GLOBAL	awards_0_program_manager_nid	String	50534
attribute	NC_GLOBAL	cdm_data_type	String	Other
attribute	NC_GLOBAL	Conventions	String	COARDS, CF-1.6, ACDD-1.3
attribute	NC_GLOBAL	creator_email	String	info at bco-dmo.org
attribute	NC_GLOBAL	creator_name	String	BCO-DMO
attribute	NC_GLOBAL	creator_type	String	institution
attribute	NC_GLOBAL	creator_url	String	https://www.bco-dmo.org/
attribute	NC_GLOBAL	data_source	String	extract_data_as_tsv version 2.3 19 Dec 2019
attribute	NC_GLOBAL	dataset_current_state	String	Final and no updates
attribute	NC_GLOBAL	date_created	String	2020-05-26T20:31:21Z
attribute	NC_GLOBAL	date_modified	String	2020-07-08T20:46:44Z
attribute	NC_GLOBAL	defaultDataQuery	String	&time<now
attribute	NC_GLOBAL	infoUrl	String	https://www.bco-dmo.org/dataset/812936
attribute	NC_GLOBAL	institution	String	BCO-DMO
attribute	NC_GLOBAL	instruments_0_acronym	String	Automated Sequencer
attribute	NC_GLOBAL	instruments_0_dataset_instrument_description	String	RNA sequencing was performed using the Illumina NextSeq 550 platform (Univ. of Georgia).
attribute	NC_GLOBAL	instruments_0_dataset_instrument_nid	String	813310
attribute	NC_GLOBAL	instruments_0_description	String	General term for a laboratory instrument used for deciphering the order of bases in a strand of DNA. Sanger sequencers detect fluorescence from different dyes that are used to identify the A, C, G, and T extension reactions. Contemporary or Pyrosequencer methods are based on detecting the activity of DNA polymerase (a DNA synthesizing enzyme) with another chemiluminescent enzyme. Essentially, the method allows sequencing of a single strand of DNA by synthesizing the complementary strand along it, one base pair at a time, and detecting which base was actually added at each step.
attribute	NC_GLOBAL	instruments_0_instrument_name	String	Automated DNA Sequencer
attribute	NC_GLOBAL	instruments_0_instrument_nid	String	649
attribute	NC_GLOBAL	instruments_0_supplied_name	String	Illumina NextSeq 550 platform
attribute	NC_GLOBAL	keywords	String	bco, bco-dmo, biological, biosynthetic, Biosynthetic_pathway, chemical, cycle, data, dataset, dmo, erddap, ID_19R1, ID_26R2, ID_2R1, ID_31R1, ID_42R2, ID_51R3, ID_62R1, ID_68R4, ID_71R1, ID_81R2, ID_84R6, management, oceanography, office, pathway, preliminary
attribute	NC_GLOBAL	license	String	https://www.bco-dmo.org/dataset/812936/license
attribute	NC_GLOBAL	metadata_source	String	https://www.bco-dmo.org/api/dataset/812936
attribute	NC_GLOBAL	param_mapping	String	{'812936': {}}
attribute	NC_GLOBAL	parameter_source	String	https://www.bco-dmo.org/mapserver/dataset/812936/parameters
attribute	NC_GLOBAL	people_0_affiliation	String	Woods Hole Oceanographic Institution
attribute	NC_GLOBAL	people_0_affiliation_acronym	String	WHOI
attribute	NC_GLOBAL	people_0_person_name	String	Virginia P. Edgcomb
attribute	NC_GLOBAL	people_0_person_nid	String	51284
attribute	NC_GLOBAL	people_0_role	String	Principal Investigator
attribute	NC_GLOBAL	people_0_role_type	String	originator
attribute	NC_GLOBAL	people_1_affiliation	String	Woods Hole Oceanographic Institution
attribute	NC_GLOBAL	people_1_affiliation_acronym	String	WHOI
attribute	NC_GLOBAL	people_1_person_name	String	Virginia P. Edgcomb
attribute	NC_GLOBAL	people_1_person_nid	String	51284
attribute	NC_GLOBAL	people_1_role	String	Contact
attribute	NC_GLOBAL	people_1_role_type	String	related
attribute	NC_GLOBAL	people_2_affiliation	String	Woods Hole Oceanographic Institution
attribute	NC_GLOBAL	people_2_affiliation_acronym	String	WHOI BCO-DMO
attribute	NC_GLOBAL	people_2_person_name	String	Karen Soenen
attribute	NC_GLOBAL	people_2_person_nid	String	748773
attribute	NC_GLOBAL	people_2_role	String	BCO-DMO Data Manager
attribute	NC_GLOBAL	people_2_role_type	String	related
attribute	NC_GLOBAL	project	String	Subseafloor Lower Crust Microbiology
attribute	NC_GLOBAL	projects_0_acronym	String	Subseafloor Lower Crust Microbiology
attribute	NC_GLOBAL	projects_0_description	String	NSF abstract: The lower ocean crust has remained largely unexplored and represents one of the last frontiers for biological exploration on Earth. Preliminary data indicate an active subsurface biosphere in samples of the lower oceanic crust collected from Atlantis Bank in the SW Indian Ocean as deep as 790 m below the seafloor. Even if life exists in only a fraction of the habitable volume where temperatures permit and fluid flow can deliver carbon and energy sources, an active lower oceanic crust biosphere would have implications for deep carbon budgets and yield insights into microbiota that may have existed on early Earth. This is all of great interest to other research disciplines, educators, and students alike. A K-12 education program will capitalize on groundwork laid by outreach collaborator, A. Martinez, a 7th grade teacher in Eagle Pass, TX, who sailed as outreach expert on Drilling Expedition 360. Martinez works at a Title 1 school with ~98% Hispanic and ~2% Native American students and a high number of English Language Learners and migrants. Annual school visits occur during which the project investigators present hands-on activities introducing students to microbiology, and talks on marine microbiology, the project, and how to pursue science-related careers. In addition, monthly Skype meetings with students and PIs update them on project progress. Students travel to the University of Texas Marine Science Institute annually, where they get a campus tour and a 3-hour cruise on the R/V Katy, during which they learn about and help with different oceanographic sampling approaches. The project partially supports two graduate students, a Woods Hole undergraduate summer student, the participation of multiple Texas A&M undergraduate students, and 3 principal investigators at two institutions, including one early career researcher who has not previously received NSF support of his own.
Given the dearth of knowledge of the lower oceanic crust, this project is poised to transform our understanding of life in this vast environment. The project assesses metabolic functions within all three domains of life in this crustal biosphere, with a focus on nutrient cycling and evaluation of connections to other deep marine microbial habitats. The lower ocean crust represents a potentially vast biosphere whose microbial constituents and the biogeochemical cycles they mediate are likely linked to deep ocean processes through faulting and subsurface fluid flow. Atlantis Bank represents a tectonic window that exposes lower oceanic crust directly at the seafloor. This enables seafloor drilling and research on an environment that can transform our understanding of connections between the deep subseafloor biosphere and the rest of the ocean. Preliminary analysis of recovered rocks from Expedition 360 suggests the interaction of seawater with the lower oceanic crust creates varied geochemical conditions capable of supporting diverse microbial life by providing nutrients and chemical energy. This project is the first interdisciplinary investigation of the microbiology of all 3 domains of life in basement samples that combines diversity and "meta-omics" analyses, analysis of nutrient addition experiments, high-throughput culturing and physiological analyses of isolates, including evaluation of their ability to utilize specific carbon sources, Raman spectroscopy, and lipid biomarker analyses. Comparative genomics are used to compare genes and pathways relevant to carbon cycling in these samples to data from published studies of other deep-sea environments. The collected samples present a rare and time-sensitive opportunity to gain detailed insights into microbial life, available carbon and energy sources for this life, and of dispersal of microbiota and connections in biogeochemical processes between the lower oceanic crust and the overlying aphotic water column. 
About the study area: The International Ocean Discovery Program (IODP) Expedition 360 explored the lower crust at Atlantis Bank, a 12 Ma oceanic core complex on the ultraslow-spreading SW Indian Ridge. This oceanic core complex represents a tectonic window that exposes lower oceanic crust and mantle directly at the seafloor, and the expedition provided an unprecedented opportunity to access this habitat in the Indian Ocean.
attribute	NC_GLOBAL	projects_0_end_date	String	2020-01
attribute	NC_GLOBAL	projects_0_geolocation	String	SW Indian Ridge, Indian Ocean
attribute	NC_GLOBAL	projects_0_name	String	Collaborative Research: Delineating The Microbial Diversity and Cross-domain Interactions in The Uncharted Subseafloor Lower Crust Using Meta-omics and Culturing Approaches
attribute	NC_GLOBAL	projects_0_project_nid	String	709556
attribute	NC_GLOBAL	projects_0_start_date	String	2017-02
attribute	NC_GLOBAL	publisher_name	String	Biological and Chemical Oceanographic Data Management Office (BCO-DMO)
attribute	NC_GLOBAL	publisher_type	String	institution
attribute	NC_GLOBAL	sourceUrl	String	(local files)
attribute	NC_GLOBAL	standard_name_vocabulary	String	CF Standard Name Table v55
attribute	NC_GLOBAL	summary	String	Supplementary Table 4A: Metatranscriptome data summary for cellular activities presented and statistics on sequencing and removal of potential contaminant sequences: FPKM values. Samples were taken on board the R/V JOIDES Resolution between November 30, 2015 and January 30, 2016.
attribute	NC_GLOBAL	title	String	[IODP360 - FPKM values] - Supplementary Table 4A: Metatranscriptome data summary for cellular activities presented and statistics on sequencing and removal of potential contaminant sequences, FPKM values (Collaborative Research: Delineating The Microbial Diversity and Cross-domain Interactions in The Uncharted Subseafloor Lower Crust Using Meta-omics and Culturing Approaches)
attribute	NC_GLOBAL	version	String	1
attribute	NC_GLOBAL	xml_source	String	osprey2erddap.update_xml() v1.5
variable	Cycle		String
attribute	Cycle	bcodmo_name	String	unknown
attribute	Cycle	description	String	Cycle of the biosynthetic pathway
attribute	Cycle	long_name	String	Cycle
attribute	Cycle	units	String	unitless
variable	Biosynthetic_pathway		String
attribute	Biosynthetic_pathway	bcodmo_name	String	unknown
attribute	Biosynthetic_pathway	description	String	Name of biosynthetic pathway
attribute	Biosynthetic_pathway	long_name	String	Biosynthetic Pathway
attribute	Biosynthetic_pathway	units	String	unitless
variable	ID_2R1		float
attribute	ID_2R1	_FillValue	float	NaN
attribute	ID_2R1	actual_range	float	0.0, 540.183
attribute	ID_2R1	bcodmo_name	String	unknown
attribute	ID_2R1	description	String	FPKM values per pathway for sample 2R1
attribute	ID_2R1	long_name	String	ID 2 R1
attribute	ID_2R1	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_19R1		float
attribute	ID_19R1	_FillValue	float	NaN
attribute	ID_19R1	actual_range	float	0.0, 86.455
attribute	ID_19R1	bcodmo_name	String	unknown
attribute	ID_19R1	description	String	FPKM values per pathway for sample 19R1
attribute	ID_19R1	long_name	String	ID 19 R1
attribute	ID_19R1	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_26R2		float
attribute	ID_26R2	_FillValue	float	NaN
attribute	ID_26R2	actual_range	float	0.0, 968.764
attribute	ID_26R2	bcodmo_name	String	unknown
attribute	ID_26R2	description	String	FPKM values per pathway for sample 26R2
attribute	ID_26R2	long_name	String	ID 26 R2
attribute	ID_26R2	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_31R1		float
attribute	ID_31R1	_FillValue	float	NaN
attribute	ID_31R1	actual_range	float	0.0, 256.836
attribute	ID_31R1	bcodmo_name	String	unknown
attribute	ID_31R1	description	String	FPKM values per pathway for sample 31R1
attribute	ID_31R1	long_name	String	ID 31 R1
attribute	ID_31R1	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_42R2		float
attribute	ID_42R2	_FillValue	float	NaN
attribute	ID_42R2	actual_range	float	0.0, 1003.3
attribute	ID_42R2	bcodmo_name	String	unknown
attribute	ID_42R2	description	String	FPKM values per pathway for sample 42R2
attribute	ID_42R2	long_name	String	ID 42 R2
attribute	ID_42R2	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_51R3		float
attribute	ID_51R3	_FillValue	float	NaN
attribute	ID_51R3	actual_range	float	0.0, 3001.126
attribute	ID_51R3	bcodmo_name	String	unknown
attribute	ID_51R3	description	String	FPKM values per pathway for sample 51R3
attribute	ID_51R3	long_name	String	ID 51 R3
attribute	ID_51R3	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_62R1		float
attribute	ID_62R1	_FillValue	float	NaN
attribute	ID_62R1	actual_range	float	0.0, 2625.771
attribute	ID_62R1	bcodmo_name	String	unknown
attribute	ID_62R1	description	String	FPKM values per pathway for sample 62R1
attribute	ID_62R1	long_name	String	ID 62 R1
attribute	ID_62R1	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_68R4		float
attribute	ID_68R4	_FillValue	float	NaN
attribute	ID_68R4	actual_range	float	0.0, 1097.931
attribute	ID_68R4	bcodmo_name	String	unknown
attribute	ID_68R4	description	String	FPKM values per pathway for sample 68R4
attribute	ID_68R4	long_name	String	ID 68 R4
attribute	ID_68R4	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_71R1		float
attribute	ID_71R1	_FillValue	float	NaN
attribute	ID_71R1	actual_range	float	0.0, 500.262
attribute	ID_71R1	bcodmo_name	String	unknown
attribute	ID_71R1	description	String	FPKM values per pathway for sample 71R1
attribute	ID_71R1	long_name	String	ID 71 R1
attribute	ID_71R1	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_81R2		float
attribute	ID_81R2	_FillValue	float	NaN
attribute	ID_81R2	actual_range	float	0.0, 163524.0
attribute	ID_81R2	bcodmo_name	String	unknown
attribute	ID_81R2	description	String	FPKM values per pathway for sample 81R2
attribute	ID_81R2	long_name	String	ID 81 R2
attribute	ID_81R2	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
variable	ID_84R6		float
attribute	ID_84R6	_FillValue	float	NaN
attribute	ID_84R6	actual_range	float	0.0, 352.436
attribute	ID_84R6	bcodmo_name	String	unknown
attribute	ID_84R6	description	String	FPKM values per pathway for sample 84R6
attribute	ID_84R6	long_name	String	ID 84 R6
attribute	ID_84R6	units	String	Fragments per Kilobase of transcript per Million mapped reads (FPKM)
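
The sample columns in the table above are reported in FPKM. As a minimal sketch of what that unit means, the function below applies the standard FPKM definition to a fragment count, a transcript length, and a total mapped-fragment count. It is not necessarily the exact computation performed by the RSEM/Trinity workflow named in acquisition_description; the function name and the example numbers are hypothetical.

```python
def fpkm(fragments: float, transcript_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    Standard definition:
        FPKM = fragments / (length_bp / 1e3) / (total_mapped / 1e6)
             = fragments * 1e9 / (length_bp * total_mapped)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript,
# in a library with 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
```

Because FPKM divides by both transcript length and library size, values are comparable across transcripts within a sample; the source states that cross-sample normalization was handled separately with TMM.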

The information in the table above is also available in other file formats (.csv, .htmlTable, .itx, .json, .jsonlCSV1, .jsonlCSV, .jsonlKVP, .mat, .nc, .nccsv, .tsv, .xhtml) via a RESTful web service.
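
As a hedged illustration of using that web service, the sketch below builds a request URL for a chosen file type and parses an ERDDAP-style .csv response, in which the first row holds column names and the second row holds units. The server base URL is an assumption (it does not appear on this page), the helper names are hypothetical, and the sample rows are made-up placeholders rather than values from the dataset.

```python
import csv
import io
import math

# Assumption: base URL of the ERDDAP tabledap service hosting this dataset.
ERDDAP_BASE = "https://erddap.bco-dmo.org/erddap/tabledap"
DATASET_ID = "bcodmo_dataset_812936"


def build_url(file_type: str = ".csv") -> str:
    """Return the request URL for one of the listed file types."""
    return f"{ERDDAP_BASE}/{DATASET_ID}{file_type}"


def parse_erddap_csv(text: str) -> list[dict]:
    """Parse ERDDAP .csv text: row 1 = column names, row 2 = units."""
    reader = csv.reader(io.StringIO(text))
    names = next(reader)
    next(reader)  # skip the units row
    rows = []
    for record in reader:
        row = {}
        for name, value in zip(names, record):
            if name.startswith("ID_"):
                # Sample columns are floats; missing values (_FillValue) are NaN.
                row[name] = float(value) if value else math.nan
            else:
                row[name] = value
        rows.append(row)
    return rows


# Placeholder data shaped like the response (not real dataset values).
SAMPLE = (
    "Cycle,Biosynthetic_pathway,ID_2R1,ID_19R1\n"
    "unitless,unitless,FPKM,FPKM\n"
    "Example cycle,Example pathway,12.5,NaN\n"
)
rows = parse_erddap_csv(SAMPLE)
```

To read the live table, the text returned from `build_url(".csv")` (for example via `urllib.request.urlopen`) could be passed to `parse_erddap_csv`, assuming the base URL above is correct.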