Search database against itself #385

@Xillanne

Hello,

I've built my own database from MSA files inferred with MAFFT. I want to compare all of the profiles against each other. For now, I loop over every entry of my database, extract its HHM profile, and search it against the database.
Is there another way to do this? For example, can I search one database directly against another?

Thanks for your answer!

Here is my code.

ffindex_build -s -f ${current_path}/msa_files.txt ${current_path}/msa.ffdata ${current_path}/msa.ffindex

# Convert each MSA to A3M format and write the result to a new ffindex database
echo "Converting to A3M format..."
OMP_NUM_THREADS=1 mpirun -np $num_threads \
  ffindex_apply_mpi ${current_path}/msa.ff{data,index}  \
   -i ${current_path}/msa_a3m.ffindex -d ${current_path}/msa_a3m.ffdata \
    -- hhconsensus -M 50 -maxres 65535 -i stdin -oa3m stdout -v 0 > "${current_path}/hhconsensus_mpi_errors.log"

# Build an HHM profile for each A3M and write the result to a new ffindex database
echo "Creating HHM profiles..."
OMP_NUM_THREADS=1 mpirun -np $num_threads \
  ffindex_apply_mpi ${current_path}/msa_a3m.ff{data,index} \
  -i ${current_path}/msa_hhm.ffindex -d ${current_path}/msa_hhm.ffdata \
  -- hhmake -i stdin -o stdout -v 0 -id 100 > "${current_path}/hhmake_mpi_errors.log"

cd ${current_path}

sort msa_hhm.ffindex > msa_hhm.ffindex.simpleSort
sort msa_a3m.ffindex > msa_a3m.ffindex.simpleSort
mv msa_a3m.ffindex msa_a3m.ffindex.orig
mv msa_hhm.ffindex msa_hhm.ffindex.orig
ln -s msa_a3m.ffindex.simpleSort msa_a3m.ffindex
ln -s msa_hhm.ffindex.simpleSort msa_hhm.ffindex


# Build the cs219 (column-state) database from the A3M database (msa_a3m.ff{data,index})
cstranslate -f -i msa_a3m -o msa_cs219 -A /data/desmarais/Applications/hh-suite/data/cs219.lib -D /data/desmarais/Applications/hh-suite/data/context_data.lib -x 0.3 -c 4 -I a3m -b

# Putting everything together
sort -k3 -n msa_cs219.ffindex | cut -f1 > sorting.dat

ffindex_order sorting.dat msa_hhm.ff{data,index} msa_hhm_ordered.ff{data,index}
mv msa_hhm_ordered.ffindex msa_hhm.ffindex
mv msa_hhm_ordered.ffdata msa_hhm.ffdata

ffindex_order sorting.dat msa_a3m.ff{data,index} msa_a3m_ordered.ff{data,index}
mv msa_a3m_ordered.ffindex msa_a3m.ffindex
mv msa_a3m_ordered.ffdata msa_a3m.ffdata

cd "$output_dir" || exit
count=0

echo "Performing profile-profile comparison..."
for i in *.msa; do
  (
    file_prefix=$(basename "$i" .msa)

    # Extract this entry's HHM profile from the database
    ffindex_get "${current_path}/msa_hhm.ffdata" "${current_path}/msa_hhm.ffindex" "$i" > "${file_prefix}.hhm"

    # Search the profile against the whole database
    hhsearch -i "${file_prefix}.hhm" -d "${current_path}/msa" -o "${current_path}/${output_hhr_dir}/${file_prefix}_result.hhr"
  ) &

  ((count++))
  if (( count % max_jobs == 0 )); then
    wait  # Wait for the current batch
  fi
done
wait  # Wait for the final, possibly partial, batch
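
For reference, the per-entry loop above could also be driven from the ffindex entry list instead of the *.msa glob, with xargs limiting the number of parallel jobs. This is only a sketch of the same loop, not a built-in database-against-database mode. It assumes the usual tab-separated ffindex layout (name, offset, length), entry names without whitespace, and GNU xargs (for -r and -P). The names list_entries, run_all_vs_all and search_one.sh are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: run one command per ffindex entry, with bounded parallelism.

# Print the entry names (first tab-separated column) of an ffindex file.
list_entries() {
  cut -f1 "$1"
}

# Usage: run_all_vs_all INDEX MAX_JOBS COMMAND [ARGS...]
# Runs "COMMAND ARGS... ENTRY" once per entry in INDEX, keeping at most
# MAX_JOBS processes running at the same time.
# -r skips the run entirely when the index is empty (GNU xargs).
run_all_vs_all() {
  local index="$1" max_jobs="$2"
  shift 2
  list_entries "$index" | xargs -r -P "$max_jobs" -n 1 "$@"
}

# Hypothetical usage, where search_one.sh is a wrapper script (not part of
# HH-suite) that does the ffindex_get + hhsearch steps for a single entry:
#   run_all_vs_all "${current_path}/msa_hhm.ffindex" "$max_jobs" ./search_one.sh
```

Unlike the batch-and-wait pattern, xargs starts a new job as soon as one finishes, so a single slow search does not hold up the rest of its batch.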
