Download a CSV file from the PDB site (accessible from “Analyze” > “PDB Statistics” > “by Experimental Method and Molecular Type”). Move this CSV file into your RStudio project and use it to answer the following questions:
Q1: What percentage of structures in the PDB are solved by X-Ray and Electron Microscopy? Sum the X-ray and EM totals and divide by the sum of all values: **(199931 + 29978) / 244290 × 100% = 94.1%**
Q2: What proportion of structures in the PDB are protein? **239193/244290*100%=97.9%**
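
The arithmetic for Q1 and Q2 can be reproduced in R. This is a minimal sketch, not the worksheet's own code: it assumes the counts in the CSV export are comma-formatted strings (such as "199,931"), and the helper names `to_num` and `pct` are hypothetical. The counts are the ones quoted in the answers above.

```r
# Convert comma-formatted counts such as "199,931" to numbers.
# (Assumption: the CSV export stores counts as text with thousands separators.)
to_num <- function(x) as.numeric(gsub(",", "", x))

# Percentage of `part` in `total`, rounded to one decimal place.
pct <- function(part, total) round(100 * part / total, 1)

# Counts quoted in the answers to Q1 and Q2.
xray    <- to_num("199,931")
em      <- to_num("29,978")
protein <- to_num("239,193")
total   <- to_num("244,290")

q1 <- pct(xray + em, total)  # X-ray + EM share of all structures
q2 <- pct(protein, total)    # protein share of all structures

print(q1)  # 94.1
print(q2)  # 97.9
```
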
Q3: Type HIV in the PDB website search box on the home page and determine how many HIV-1 protease structures are in the current PDB? 4866
Q4: Water molecules normally have 3 atoms. Why do we see just one atom per water molecule in this structure? *Each water molecule is represented by a single oxygen atom because hydrogen atoms scatter X-rays too weakly to be resolved at typical crystallographic resolutions.*
Q5: There is a critical “conserved” water molecule in the binding site. Can you identify this water molecule? What residue number does this water molecule have? HOH 308
Q6: Generate and save a figure clearly showing the two distinct chains of HIV-protease along with the ligand. You might also consider showing the catalytic residues ASP 25 in each chain and the critical water (we recommend “Ball & Stick” for these side-chains). Add this figure to your Quarto document.
library(bio3d)
pdb <- read.pdb("1hsg")
## Note: Accessing on-line PDB file
pdb
##
## Call: read.pdb(file = "1hsg")
##
## Total Models#: 1
## Total Atoms#: 1686, XYZs#: 5058 Chains#: 2 (values: A B)
##
## Protein Atoms#: 1514 (residues/Calpha atoms#: 198)
## Nucleic acid Atoms#: 0 (residues/phosphate atoms#: 0)
##
## Non-protein/nucleic Atoms#: 172 (residues: 128)
## Non-protein/nucleic resid values: [ HOH (127), MK1 (1) ]
##
## Protein sequence:
## PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD
## QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNFPQITLWQRPLVTIKIGGQLKE
## ALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTP
## VNIIGRNLLTQIGCTLNF
##
## + attr: atom, xyz, seqres, helix, sheet,
## calpha, remark, call
Q7: How many amino acid residues are there in this pdb object? 198
Q8: Name one of the two non-protein residues? HOH
Q9: How many protein chains are in this structure? 2
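
The answers to Q7 to Q9 can also be pulled out of the object programmatically rather than read off the printed summary. The sketch below assumes the layout shown in the output above: an `atom` data frame with `type`, `resid` and `chain` columns, and a logical `calpha` vector. The function name `summarise_pdb` and the `toy` object are hypothetical; the toy stands in for a real structure so that the example runs without bio3d or network access.

```r
# Summarise a pdb-like list: residue count, chain IDs, non-protein residue names.
summarise_pdb <- function(pdb) {
  atoms <- pdb$atom
  list(
    n_residues   = sum(pdb$calpha),                            # one C-alpha per residue
    chains       = unique(atoms$chain),                        # distinct chain IDs
    hetero_resid = unique(atoms$resid[atoms$type == "HETATM"]) # non-protein residues
  )
}

# Toy stand-in mimicking the fields of a bio3d pdb object (not real 1hsg data).
toy <- list(
  atom = data.frame(
    type  = c("ATOM", "ATOM", "ATOM", "ATOM", "HETATM", "HETATM"),
    elety = c("N", "CA", "N", "CA", "C1", "O"),
    resid = c("PRO", "PRO", "PRO", "PRO", "MK1", "HOH"),
    chain = c("A", "A", "B", "B", "B", "A"),
    stringsAsFactors = FALSE
  ),
  calpha = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)
)

res <- summarise_pdb(toy)
print(res$n_residues)    # 2
print(res$chains)        # "A" "B"
print(res$hetero_resid)  # "MK1" "HOH"
```

Calling `summarise_pdb(pdb)` on the object returned by `read.pdb("1hsg")` should agree with the printed summary (198 residues, chains A and B, residues HOH and MK1), provided the object has the fields assumed here.
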
attributes(pdb)
## $names
## [1] "atom" "xyz" "seqres" "helix" "sheet" "calpha" "remark" "call"
##
## $class
## [1] "pdb" "sse"
Q10. Which of the packages above is found only on BioConductor and not CRAN? msa
Q11. Which of the above packages is not found on BioConductor or CRAN? bio3dview
Q12. True or False? Functions from the pak package can be used to install packages from GitHub and BitBucket? True
Q13. How many amino acids are in this sequence, i.e. how long is this sequence? 214 amino acids