User Suggestions


bounds

It may be nice to extend this useful function to produce boundaries without reordering the vector.
For example, given the vector a:

a <- c(2,3,4,5,6,7,8,9,10,
20,21,22,23,
45,46,47,48,49,50,
20,21,22,23,24,25)

Currently:

> bounds(a)

start end length
1 2 10 9
2 20 25 6
3 45 50 6

There could be a flag to indicate no sorting.

> bounds(a, sort=FALSE)

start end length
1 2 10 9
2 20 23 4
3 45 50 6
4 20 25 6

fit.xyz

Would be useful to have a flag to write partial.pdbs, so that it will create pdb files of the portions used in the superposition only (instead of the full pdbs).

identity

Currently, it returns NA for sequences in an alignment which have no sequence identity

ide.mat <- identity(aln)

It would be better if these were considered as 0, for no sequence identity

ide.mat[which(is.na(ide.mat))] <- 0

read.fasta.pdb

1. It would be useful if this function could tolerate missing pdb files.
A warning should be issued, but it would allow it to cope with large sequence alignments where only a few proteins a 3D structure.
The work around is to create a subset-aligment for those proteins with structures.

2. I had a problem reading a PDB file with this function. The first residue in the PDB file was numbered 0 and coordinates were only given for 2 of it's atoms. When the residue is considered as part of the sequence in alignment, read.fasta.pdb crashed because it thinks the sequence it has for the protein does not match the PDB files coordinates. (pdb file, fasta file) This is because read.fasta.pdb looks at Calphas. Thus, if a residue does not have a Calpha, it should not be included in the alignment file.

read.pdb

In PDB files, selenomethionines are usually stored in the HETATM card, as residue type MSE.
In some cases, the user may want to store their coordinates along with the other atoms.
A simple way to achieve this would be to have a flag to turn on selenomethionine processing (default would be off/FALSE):
pdb <- read.pdb(file="file1.pdb", mse=TRUE)

The read.pdb function could catch this and simply change all MSE HETATM into ATOM cards, before performing any other processing.

Example:

if (mse == TRUE) {
    system(paste(sed -i '/MSE/s/HETATM/ATOM /'), file)
}
...

Or, without altering the user's PDB file:

if (mse == TRUE) {
    system(paste(sed -i '/MSE/s/HETATM/ATOM /'), file, "> tmp"))
    file <- tmp
}
....
if (mse == TRUE) {
    system("rm tmp")
}

write.pdb

Encountered a problem when using this function with het=TRUE.
I gave it a pdb object and wanted it to output HETATM coordinates, but got the following error message:

"write.pdb: the lengths of all input vectors != 'length(xyz)/3'"

To fix it, within the if loop on line 26, I added a couple of lines that supply the hetatm info to both the natom and the xyz structures. In addition, I added a fix in case only one HETATM card is supplied, making pdb$het a vector rather than a matrix.
I am not sure wether this has broken the multi model functionality, etc.

if (het) {
     if(is.vector(pdb$het)) {
         pdb$het <- matrix(pdb$het, nrow=1, ncol=13, dimnames=list("",names(pdb$het)))
     }
     card <- c(rep("ATOM", nrow(pdb$atom)), rep("HETATM", nrow(pdb$het)))
     natom <- natom + length(pdb$het[,1])
     xyz <- c(xyz, as.numeric(t(pdb$het[,c("x","y","z")])))
}


Page Information

  • 1 year ago [history]
  • View page source
  • You're not logged in
  • No tags yet learn more

Wiki Information

Recent PBwiki Blog Posts