agat_sp_extract_sequences.pl Could not open index file but it exists #479

Open
hans-vg opened this issue Aug 30, 2024 · 1 comment

hans-vg commented Aug 30, 2024

Describe the bug
When I run the command below, it gets close to the end of the process but then fails with an error saying it could not open the index file, even though the file exists. The input GFF file is the output of a MAKER run.

General (please complete the following information):

  • AGAT version: v1.4.0
  • AGAT installation/use: miniforge (mamba install)
  • OS: Ubuntu

To Reproduce
agat_sp_extract_sequences.pl --gff highquality_set_aed03.gff --fasta JG_Nov2021_contig.fa -t cds -o highquality_set_aed03.CDS.fasta

Additional context
Log file:

agat_sp_extract_sequences.pl --gff highquality_set_aed03.gff --fasta JG_Nov2021_contig.fa -t cds -o highquality_set_aed03.CDS.fasta

 ------------------------------------------------------------------------------
|   Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0                      |
|   https://github.com/NBISweden/AGAT                                          |
|   National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se         |
 ------------------------------------------------------------------------------
=> Using standard /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/agat_config.yaml config file
We will extract the cds sequences.
Reading file highquality_set_aed03.gff
                                        
                                       
                          ------ Start parsing ------                           
-------------------------- parse options and metadata --------------------------
=> Accessing the feature_levels YAML file
Using standard /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/feature_levels.yaml file
=> Attribute used to group features when no Parent/ID relationship exists (i.e common tag):
	* locus_tag
	* gene_id
=> merge_loci option deactivated
=> Machine information:
	This script is being run by perl v5.32.1
	Bioperl location being used: /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/
	Operating system being used: linux 
=> Accessing Ontology
	No ontology accessible from the gff file header!
	We use the SOFA ontology distributed with AGAT:
		/home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/so.obo
	Read ontology /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/auto/share/dist/AGAT/so.obo:
		4 root terms, and 2596 total terms, and 1516 leaf terms
	Filtering ontology:
		We found 1861 terms that are sequence_feature or is_a child of it.
--------------------------------- parsing file ---------------------------------
=> Number of line in file: 1615817
=> Number of comment lines: 0
=> Fasta included: No
=> Number of features lines: 1615817
=> Number of feature type (3rd column): 6
	* Level1: 1 => gene
	* level2: 1 => mRNA
	* level3: 4 => exon three_prime_UTR five_prime_UTR CDS
	* unknown: 0 => 
=> Version of the Bioperl GFF parser selected by AGAT: 3
Parsing: 100% [======================================================]D 0h03m05s
                 ------ End parsing (done in 188 second) ------                 


                           ------ Start checks ------                           
---------------------------- Check1: feature types -----------------------------
----------------------------------- ontology -----------------------------------
All feature types in agreement with the Ontology.
------------------------------------- agat -------------------------------------
AGAT can deal with all the encountered feature types (3rd column)
------------------------------ done in 0 seconds -------------------------------

------------------------------ Check2: duplicates ------------------------------
None found
------------------------------ done in 0 seconds -------------------------------

-------------------------- Check3: sequential bucket ---------------------------
Nothing to check as sequential bucket!
------------------------------ done in 0 seconds -------------------------------

--------------------------- Check4: l2 linked to l3 ----------------------------
No problem found
------------------------------ done in 1 seconds -------------------------------

--------------------------- Check5: l1 linked to l2 ----------------------------
No problem found
------------------------------ done in 1 seconds -------------------------------

--------------------------- Check6: remove orphan l1 ---------------------------
We remove only those not supposed to be orphan
None found
------------------------------ done in 0 seconds -------------------------------

------------------------- Check7: all level3 locations -------------------------
------------------------------ done in 24 seconds ------------------------------

------------------------------ Check8: check cds -------------------------------
No problem found
------------------------------ done in 0 seconds -------------------------------

----------------------------- Check9: check exons ------------------------------
No exons created
No exons locations modified
No supernumerary exons removed
No level2 locations modified
------------------------------ done in 18 seconds ------------------------------

----------------------------- Check10: check utrs ------------------------------
No UTRs created
No UTRs locations modified
No supernumerary UTRs removed
------------------------------ done in 11 seconds ------------------------------

------------------------ Check11: all level2 locations -------------------------
No problem found
------------------------------ done in 16 seconds ------------------------------

------------------------ Check12: all level1 locations -------------------------
No problem found
------------------------------ done in 2 seconds -------------------------------

---------------------- Check13: remove identical isoforms ----------------------
None found
------------------------------ done in 0 seconds -------------------------------
                  ------ End checks (done in 73 second) ------                  


Parsing Finished

------------- EXCEPTION -------------
MSG: Could not open index file JG_Nov2021_contig.fa.index: No such file or directory
STACK Bio::DB::IndexedBase::_open_index /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:678
STACK Bio::DB::IndexedBase::_index_files /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:655
STACK Bio::DB::IndexedBase::index_file /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:488
STACK Bio::DB::IndexedBase::new /home/hvasquezgross/miniforge3/envs/agat/lib/perl5/site_perl/Bio/DB/IndexedBase.pm:365
STACK toplevel /home/hvasquezgross/miniforge3/envs/agat/bin/agat_sp_extract_sequences.pl:158

File listing after run:
ls -alh

drwxrwxr-x 2 hvasquezgross hvasquezgross   14 Aug 29 16:51 .
drwxrwxr-x 3 hvasquezgross hvasquezgross   18 Aug 29 15:23 ..
-rw-rw-r-- 1 hvasquezgross hvasquezgross 5.4K Aug 29 17:03 highquality_set_aed03.agat.log
-rw-rw-r-- 1 hvasquezgross hvasquezgross    0 Aug 29 16:58 highquality_set_aed03.CDS.fasta
-rw-r--r-- 1 hvasquezgross hvasquezgross 279M Aug 29 15:53 highquality_set_aed03.gff
-rw-rw-r-- 1 hvasquezgross hvasquezgross 3.1G Aug 29 16:00 JG_Nov2021_contig.fa
-rw-r--r-- 1 hvasquezgross hvasquezgross 336K Aug 29 16:05 JG_Nov2021_contig.fa.index

Juke34 (Collaborator) commented Sep 2, 2024

No idea, this is strange. Did you try removing the index and re-running?
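
For reference, a minimal sketch of that suggestion in shell. It assumes the index is the `JG_Nov2021_contig.fa.index` file shown in the listing above (the FASTA name plus `.index`, as in the error message) and that the index should be rebuilt on the next run once it is gone. The helper name `remove_stale_index` is hypothetical and not part of AGAT.

```shell
# Hypothetical helper (not part of AGAT): delete the index that sits next to
# a FASTA file so that the next run has to build a fresh one.
remove_stale_index() {
    fasta="$1"
    index="${fasta}.index"   # naming taken from the error message in the report
    if [ -e "$index" ]; then
        rm -f -- "$index"
        echo "removed $index"
    else
        echo "no index found for $fasta"
    fi
}

# Usage with the file names from the report (run in the directory holding the files):
#   remove_stale_index JG_Nov2021_contig.fa
#   agat_sp_extract_sequences.pl --gff highquality_set_aed03.gff \
#       --fasta JG_Nov2021_contig.fa -t cds -o highquality_set_aed03.CDS.fasta
```

If the error comes back after the index has been rebuilt, the index file itself is probably not the cause, and the next things to check would be the working directory and the permissions on it.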
