dbdataset is a data package containing the dvobject R object. The dvobject contains lists of data frames holding the parsed DrugBank database. It was built using the dbparser R package.
dvobject can be used for conveniently exploring and analyzing the contents of the DrugBank database. dvobject is also intended to assist in drug discovery endeavors that plan to make use of the DrugBank database.
Moreover, it can also be used in machine learning, in many sub-fields such as:
Although dvobject is much smaller than the unparsed DrugBank database, it still exceeds the size limit set by CRAN, so for now it is hosted on GitHub only. It can be installed with the following command:
devtools::install_github("interstellar-Consultation-Services/dbdataset")
The dvobject will then be available after running the following command:
A dvobject called drugbank will then be available to use as a regular R object.
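As a minimal sketch of what using the object looks like, the snippet below reads drugbank through the dbdataset:: prefix, the same form used in the examples in this README, and lists its top-level sub-lists. The helper name load_drugbank is hypothetical and not part of the package; the requireNamespace() guard is only there so the snippet degrades gracefully when the package is not installed.

```r
# Hypothetical helper (not part of dbdataset): returns the drugbank
# dvobject, or NULL when the dbdataset package is not installed.
load_drugbank <- function() {
  if (!requireNamespace("dbdataset", quietly = TRUE)) {
    message("dbdataset is not installed; see the install command above.")
    return(NULL)
  }
  dbdataset::drugbank
}

drugbank <- load_drugbank()
if (!is.null(drugbank)) {
  # A dvobject is an R list, so names() shows which sub-lists are present.
  print(names(drugbank))
}
```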
dvobject introduces a unified and compressed format for drug data. It is an R list object that contains one or more of the following sub-lists:
The following is the definition for each sub-list:
drugs: a list of data frames containing drug information (e.g. synonyms, classifications, …). It is the only mandatory sub-list.
names(dbdataset::drugbank[["drugs"]])
#> [1] "general_information" "drug_classification"
#> [3] "synonyms" "pharmacology"
#> [5] "international_brands" "mixtures"
#> [7] "packagers" "manufacturers"
#> [9] "prices" "categories"
#> [11] "dosages" "atc_codes"
#> [13] "patents" "drug_interactions"
#> [15] "sequences" "calculated_properties"
#> [17] "experimental_properties" "external_identifiers"
#> [19] "pathway" "reactions"
#> [21] "snp_effects" "snp_adverse_reactions"
#> [23] "food_interactions" "pdb_entries"
#> [25] "ahfs_codes" "affected_organisms"
#> [27] "groups" "external_links"
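Each of the names listed above is one data frame inside the drugs sub-list. A minimal sketch of pulling one of them out follows; get_drug_table is a hypothetical helper, not a dbdataset function, and no column names are assumed for the returned table, so it is inspected with str() first.

```r
# Hypothetical helper: pull one data frame out of the "drugs" sub-list,
# failing with a clear message when the table name is not present.
get_drug_table <- function(dvobject, table) {
  drugs <- dvobject[["drugs"]]
  if (!table %in% names(drugs)) {
    stop("Unknown drugs table: ", table, call. = FALSE)
  }
  drugs[[table]]
}

if (requireNamespace("dbdataset", quietly = TRUE)) {
  synonyms <- get_drug_table(dbdataset::drugbank, "synonyms")
  # Inspect the columns before relying on any of them.
  str(synonyms)
}
```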
salts: a data frame containing drug salt information.
head(dbdataset::drugbank[["salts"]], 5)
#> db_salt_id name unii cas_number
#> 1 DBSALT000105 Leuprolide acetate 37JNS02E7V 74381-53-6
#> 2 DBSALT003182 Leuprolide mesylate 8E3C3C493W 944347-41-5
#> 3 DBSALT001439 Sermorelin acetate 00IBG87IQW 114466-38-5
#> 4 DBSALT000093 Goserelin acetate 6YUU2PV0U8 145781-92-6
#> 5 DBSALT001733 Insulin human zinc suspension
#> inchikey average_mass monoisotopic_mass drugbank_id
#> 1 YFDMUNOZURYOCP-XNHQSDQCSA-N 1269.473 1268.666591578 DB00007
#> 2 MBIDSOMXPLCOHS-XNHQSDQCSA-N 1305.52 1304.633577372 DB00007
#> 3 <NA> <NA> DB00010
#> 4 IKDXDQDKCZPQSZ-JHYYTBFNSA-N 1329.4624 1328.662568858 DB00014
#> 5 <NA> <NA> DB00030
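Because every salt row carries a drugbank_id, the salts of a given drug can be looked up with ordinary data frame subsetting. The sketch below uses a hypothetical helper, salts_for_drug, and a small stand-in data frame that copies three rows and three columns from the preview above, so it runs without the package; the same call should work on dbdataset::drugbank[["salts"]].

```r
# Hypothetical helper: keep only the salt rows that belong to one drug.
# which() drops rows where the comparison is NA instead of returning
# NA-filled rows.
salts_for_drug <- function(salts, id) {
  salts[which(salts[["drugbank_id"]] == id), , drop = FALSE]
}

# Stand-in data copied from the preview above (rows 1, 2 and 4).
salts_preview <- data.frame(
  db_salt_id  = c("DBSALT000105", "DBSALT003182", "DBSALT000093"),
  name        = c("Leuprolide acetate", "Leuprolide mesylate",
                  "Goserelin acetate"),
  drugbank_id = c("DB00007", "DB00007", "DB00014"),
  stringsAsFactors = FALSE
)

leuprolide_salts <- salts_for_drug(salts_preview, "DB00007")
print(leuprolide_salts[["name"]])
```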
products: a data frame of commercially available drug products worldwide.
head(dbdataset::drugbank[["products"]], 5)
#> name labeller ndc_id ndc_product_code dpd_id
#> 1 Refludan Bayer Ag 50419-150
#> 2 Refludan Bayer Ag 02240996
#> 3 Refludan Celgene Corporation
#> 4 Refludan Celgene Corporation
#> 5 Refludan Celgene Corporation
#> ema_product_code ema_ma_number started_marketing_on ended_marketing_on
#> 1 1998-03-06 2013-06-30
#> 2 2000-01-31 2013-07-26
#> 3 EMEA/H/C/000122 EU/1/97/035/001 2016-09-08 2012-04-24
#> 4 EMEA/H/C/000122 EU/1/97/035/002 2016-09-08 2012-04-24
#> 5 EMEA/H/C/000122 EU/1/97/035/003 2016-09-08 2012-04-24
#> dosage_form strength route
#> 1 Powder 50 mg/1mL Intravenous
#> 2 Powder, for solution 50 mg / vial Intravenous
#> 3 Injection, solution, concentrate 50 mg Intravenous
#> 4 Injection, solution, concentrate 50 mg Intravenous
#> 5 Injection, solution, concentrate 20 mg Intravenous
#> fda_application_number generic over_the_counter approved country source
#> 1 NDA020807 false false true US FDA NDC
#> 2 false false true Canada DPD
#> 3 false false false EU EMA
#> 4 false false false EU EMA
#> 5 false false false EU EMA
#> drugbank_id
#> 1 DB00001
#> 2 DB00001
#> 3 DB00001
#> 4 DB00001
#> 5 DB00001
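The products table can be filtered in the same way, for example to keep approved products for one country. The sketch below assumes that the approved column is stored as the text values "true" and "false", as the printed preview suggests; this is worth confirming with str() on the real table. The helper name approved_products is hypothetical, and the stand-in data frame copies a few columns from rows 1 to 3 of the preview.

```r
# Hypothetical helper: approved products for a single country.
# Assumption: `approved` holds the strings "true"/"false", as printed above.
approved_products <- function(products, country) {
  keep <- which(products[["country"]] == country &
                  products[["approved"]] == "true")
  products[keep, , drop = FALSE]
}

# Stand-in data copied from the preview above (rows 1 to 3).
products_preview <- data.frame(
  name     = c("Refludan", "Refludan", "Refludan"),
  labeller = c("Bayer Ag", "Bayer Ag", "Celgene Corporation"),
  approved = c("true", "true", "false"),
  country  = c("US", "Canada", "EU"),
  source   = c("FDA NDC", "DPD", "EMA"),
  stringsAsFactors = FALSE
)

us_approved <- approved_products(products_preview, "US")
print(us_approved[, c("name", "labeller", "source")])
```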