@LokeZhou
Collaborator

No description provided.

@LokeZhou LokeZhou merged commit 790149c into PaddlePaddle:develop Jul 12, 2023
@LokeZhou LokeZhou deleted the automatic_label branch July 12, 2023 03:14
zhoutianzi666 pushed a commit to zhoutianzi666/PaddleMIX that referenced this pull request Aug 4, 2023
westfish pushed a commit to westfish/PaddleMIX that referenced this pull request Sep 25, 2024
lyuwenyu added a commit that referenced this pull request Feb 20, 2025
## Operator Directory

- [1. Conversion operators](#1-转换算子)
  - [1.1 llava conversion operators](#11-llava转换算子)
    - [1.1.1 llava_convert](#111-llava_convert)
- [2. Filter operators](#2-过滤算子)
  - [2.1 Basic filter operators](#21-基础过滤算子)
    - [2.1.1 valid_data_filter](#211-valid_data_filter)
      - [2.1.1.1 image_compliance_operator](#2111-image_compliance_operator)
      - [2.1.1.2 conversation_compliance_operator](#2112-conversation_compliance_operator)
  - [2.2 Text filter operators](#22-文本过滤算子)
    - [2.2.1 conversation_length_filter](#221-conversation_length_filter)
    - [2.2.2 average_line_length_filter](#222-average_line_length_filter)
    - [2.2.3 maximum_line_length_filter](#223-maximum_line_length_filter)
    - [2.2.4 conversation_percentage_filter](#224-conversation_percentage_filter)
    - [2.2.5 token_num_filter](#225-token_num_filter)
    - [2.2.6 alphanumeric_ratio_filter](#226-alphanumeric_ratio_filter)
    - [2.2.7 stopwords_ratio_filter](#227-stopwords_ratio_filter)
    - [2.2.8 special_characters_filter](#228-special_characters_filter)
    - [2.2.9 language_id_filter](#229-language_id_filter)
    - [2.2.10 text_action_filter](#2210-text_action_filter)
    - [2.2.11 text_entity_dependency_filter](#2211-text_entity_dependency_filter)
    - [2.2.12 char_ngram_repetition_filter](#2212-char_ngram_repetition_filter)
    - [2.2.13 word_ngram_repetition_filter](#2213-word_ngram_repetition_filter)
    - [2.2.14 conversation_hash_filter](#2214-conversation_hash_filter)
      - [2.2.14.1 simhash_duplicate_operator](#22141-simhash_duplicate_operator)
      - [2.2.14.2 minhash_duplicate_operator](#22142-minhash_duplicate_operator)
    - [2.2.15 llm_judge_filter](#2215-llm_judge_filter)
  - [2.3 Image filter operators](#23-图像过滤算子)
    - [2.3.1 image_filesize_filter](#231-image_filesize_filter)
    - [2.3.2 image_ration_filter](#232-image_ration_filter)
    - [2.3.3 image_resolution_filter](#233-image_resolution_filter)
    - [2.3.4 image_hash_filter](#234-image_hash_filter)
  - [2.4 Image-text filter operators](#24-图文过滤算子)
    - [2.4.1 image_clip_filter](#241-image_clip_filter)
- [3. Analysis operators](#3-分析算子)
  - [3.1 Basic analysis operators](#31-基础分析算子)
    - [3.1.1 base_analysis_pipeline](#311-base_analysis_pipeline)
      - [3.1.1.1 analyze_dataset_statistics](#3111-analyze_dataset_statistics)
      - [3.1.1.2 analyze_language_distribution](#3112-analyze_language_distribution)
      - [3.1.1.3 analyze_image_paths](#3113-analyze_image_paths)
      - [3.1.1.4 analyze_data_anomalies](#3114-analyze_data_anomalies)
      - [3.1.1.5 analyze_conversation_tokens](#3115-analyze_conversation_tokens)
  - [3.2 Advanced analysis operators](#32-进阶分析算子)
    - [3.2.1 description_analysis](#321-description_analysis)
    - [3.2.2 quality_analysis](#322-quality_analysis)
- [4. Visualization operators](#4-可视化算子)
  - [4.1 LDA visualization operators](#41-lda可视化算子)
    - [4.1.1 lda_topic_clustering](#411-lda_topic_clustering)
- [5. Generation operators](#5-生成算子)
  - [5.1 Multimodal generation operators](#51-多模态生成算子)
    - [5.1.1 generate_qna_for_images](#511-generate_qna_for_images)



--- 
- #1055