
Commit 9432615

Merge pull request #4812 from FederatedAI/develop-1.11.1
Update documents
2 parents 5fa5522 + faa4da8 · commit 9432615

21 files changed: +93 −608 lines

README.md

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ Deploying FATE to multiple nodes to achieve scalability, reliability and manageability.
 - [Train & Predict Hetero SecureBoost with FATE-Pipeline](./doc/tutorial/pipeline/pipeline_tutorial_hetero_sbt.ipynb)
 - [Build & Customize NN models with FATE-Pipeline](./doc/tutorial/pipeline/nn_tutorial/README.md)
 - [Run Job with DSL json conf](doc/tutorial/dsl_conf/dsl_conf_tutorial.md)
+- [FATE-LLM Training Guides](doc/tutorial/fate_llm/README.md)
 - [More Tutorials...](doc/tutorial)

 ## Related Repositories (Projects)

README_zh.md

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ FATE supports multiple deployment modes; users can choose one according to their own situation. [
 - [Train & predict a hetero SBT task with FATE-Pipeline](./doc/tutorial/pipeline/pipeline_tutorial_hetero_sbt.ipynb)
 - [Build homo & hetero NN models with FATE-Pipeline](doc/tutorial/pipeline/nn_tutorial/README.md)
 - [Run jobs with DSL json conf](doc/tutorial/dsl_conf/dsl_conf_tutorial.md)
+- [FATE-LLM training tutorials](doc/tutorial/fate_llm/README.md)
 - [More tutorials](doc/tutorial)

 ## Related Repositories

doc/federatedml_component/intersect.md

Lines changed: 2 additions & 14 deletions
@@ -40,13 +40,6 @@ finding common even ids.
 With RSA intersection, participants can get their intersection ids
 securely and efficiently.

-## RAW Intersection
-
-This mode implements the simple intersection method in which a
-participant sends all its ids to another participant, and the other
-participant finds their common ids. Finally, the joining role will send
-the intersection ids to the sender.
-
 ## DH Intersection

 This mode implements secure intersection based on symmetric encryption
@@ -88,7 +81,7 @@ Intersection supports cache.

 ## Multi-Host Intersection

-RSA, RAW, and DH intersection support multi-host scenario. It means a
+RSA and DH intersection support the multi-host scenario. It means a
 guest can perform intersection with more than one host simultaneously
 and get the common ids among all participants.

@@ -155,14 +148,13 @@ And for Host:

 ## Feature

-Below lists features of each ECDH, RSA, DH, and RAW intersection methods.
+Below lists features of the ECDH, RSA, and DH intersection methods.

 | Intersect Methods | PSI | Match-ID Support | Multi-Host | Exact-Cardinality | Estimated Cardinality | Preprocessing | Cache |
 |-------------------|-----|------------------|------------|-------------------|-----------------------|---------------|-------|
 | ECDH | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh.py) | ✓ | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh-multi) | [✓](../../examples/dsl/v2/intersect/test_intersect_job_ecdh_exact_cardinality_conf.json) | ✗ | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh-w-preprocess.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-ecdh-cache.py) |
 | RSA | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa.py) | [✓](../../examples/pipeline/match_id_test/pipeline-hetero-lr.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-multi-rsa.py) | ✗ | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa-cardinality.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa-w-preprocess.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-rsa-cache.py) |
 | DH | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh.py) | ✓ | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh-multi.py) | [✓](../../examples/dsl/v2/intersect/test_intersect_job_dh_exact_cardinality_conf.json) | ✗ | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh-w-preprocess.py) | [✓](../../examples/pipeline/intersect/pipeline-intersect-dh-cache.py) |
-| RAW | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | ✗ |

 All three methods support:

@@ -180,10 +172,6 @@ RSA, DH, ECDH intersection methods also support:

 1. PSI with cache

-RAW intersection supports the following extra feature:
-
-1. base64 encoding may be used for all hashing methods.
-
 Cardinality Computation:

 1. Setting `cardinality_method` to `rsa` will produce an estimated intersection cardinality;
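As background for the DH intersection retained above: it relies on commutative encryption, where an ID encrypted first by one party and then by the other yields the same value regardless of order, so doubly-encrypted IDs can be matched without exposing non-intersecting IDs. Below is a minimal toy sketch of that idea in Python; the modulus, helper names, and message flow are illustrative assumptions, not FATE's actual implementation.

```python
# Toy sketch of DH-style PSI via commutative exponentiation.
# Assumption: illustration only; FATE's DH intersection uses vetted
# cryptographic parameters, key exchange, and secure channels.
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus; real deployments use far larger groups


def hash_to_group(uid: str) -> int:
    """Hash an ID into the multiplicative group mod P."""
    return int.from_bytes(hashlib.sha256(uid.encode()).digest(), "big") % P


# Each party holds a private exponent.
key_guest = secrets.randbelow(P - 2) + 1
key_host = secrets.randbelow(P - 2) + 1

guest_ids = ["alice", "bob", "carol"]
host_ids = ["bob", "carol", "dave"]

# Round 1: each party encrypts its own hashed IDs and exchanges them.
guest_once = {pow(hash_to_group(i), key_guest, P): i for i in guest_ids}
host_once = [pow(hash_to_group(i), key_host, P) for i in host_ids]

# Round 2: each side re-encrypts the other's values. Since
# (H(id)^a)^b == (H(id)^b)^a (mod P), common IDs collide.
guest_twice = {pow(c, key_host, P): i for c, i in guest_once.items()}
host_twice = {pow(c, key_guest, P) for c in host_once}

intersection = sorted(i for c, i in guest_twice.items() if c in host_twice)
print(intersection)  # ['bob', 'carol']
```

ECDH intersection follows the same commutative pattern over an elliptic-curve group, which is why the two share the feature set listed in the table above.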

doc/tutorial/README.zh.md

Lines changed: 5 additions & 0 deletions
@@ -8,6 +8,8 @@
 - [Train & predict `Hetero SecureBoost` with `Pipeline`](pipeline/pipeline_tutorial_hetero_sbt.ipynb)
 - [Build NN models with `Pipeline`](pipeline/nn_tutorial/README.md)
 - [Train & predict `Hetero SecureBoost` with `Match ID` using `Pipeline`](pipeline/pipeline_tutorial_match_id.ipynb)
+- [Upload data with `Meta` & train `Hetero SecureBoost`](pipeline/pipeline_tutorial_uploading_data_with_meta.ipynb)
+- [Intersect on specified columns when multiple match-ID columns exist](pipeline/pipeline_tutorial_multiple_id_columns.ipynb)

 Submitting jobs without `Pipeline` is also supported; users need to prepare job configuration files in `json` format:

@@ -22,3 +24,6 @@
 Run multiple jobs with `FATE-Test`:

 - [FATE-Test Tutorial](fate_test_tutorial.md)
+
+Merge multi-party models and export them in sklearn/LightGBM or PMML format:
+- [Model Merge & Export](./model_merge.md)

doc/tutorial/pipeline/nn_tutorial/GPT2-example.ipynb renamed to doc/tutorial/fate_llm/GPT2-example.ipynb

Lines changed: 7 additions & 7 deletions
@@ -5,15 +5,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Federated GPT-2 Tuning with Parameter Efficient methods in FATE-1.11"
+    "# Federated GPT-2 Tuning with Parameter-Efficient Methods in FATE-LLM"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In this tutorial, we will demonstrate how to efficiently train federated large language models using the FATE 1.11 framework. In FATE-1.11, we introduce the \"pellm\"(Parameter Efficient Large Language Model) module, specifically designed for federated learning with large language models. We enable the implementation of parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. In this tutorial we particularlly focus on GPT-2, and we will also emphasize the use of the Adapter mechanism for fine-tuning GPT-2, which enables us to effectively reduce communication volume and improve overall efficiency.\n",
+    "In this tutorial, we will demonstrate how to efficiently train federated large language models using the FATE-LLM framework. In FATE-LLM, we introduce the \"pellm\" (Parameter-Efficient Large Language Model) module, specifically designed for federated learning with large language models. We enable the implementation of parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. In this tutorial, we focus in particular on GPT-2, emphasizing the use of the Adapter mechanism for fine-tuning, which effectively reduces communication volume and improves overall efficiency.\n",
    "\n",
    "By following this tutorial, you will learn how to leverage the FATE framework to rapidly fine-tune federated large language models, such as GPT-2, with ease and efficiency."
    ]
@@ -600,7 +600,7 @@
    " padding_side=\"left\", return_input_ids=False, pad_token='<|endoftext|>')\n",
    "# TrainerParam\n",
    "trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
-    " data_loader_worker=8, secure_aggregate=False)\n",
+    " data_loader_worker=8, secure_aggregate=True)\n",
    "\n",
    "\n",
    "nn_component = HomoNN(name='nn_0', model=model)\n",
@@ -660,7 +660,7 @@
    "outputs": [],
    "source": [
    "trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
-    " data_loader_worker=8, secure_aggregate=False, cuda=0)"
+    " data_loader_worker=8, secure_aggregate=True, cuda=0)"
    ]
   },
   {
@@ -690,11 +690,11 @@
    "outputs": [],
    "source": [
    "client_0_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
-    " data_loader_worker=8, secure_aggregate=False, cuda=[0, 1, 2, 3])\n",
+    " data_loader_worker=8, secure_aggregate=True, cuda=[0, 1, 2, 3])\n",
    "client_1_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
-    " data_loader_worker=8, secure_aggregate=False, cuda=[0, 3, 4])\n",
+    " data_loader_worker=8, secure_aggregate=True, cuda=[0, 3, 4])\n",
    "server_param = TrainerParam(trainer_name='fedavg_trainer', epochs=1, batch_size=8, \n",
-    " data_loader_worker=8, secure_aggregate=False)\n",
+    " data_loader_worker=8, secure_aggregate=True)\n",
    "\n",
    "# set parameter for client 1\n",
    "nn_component.get_party_instance(role='guest', party_id=guest_0).component_param(\n",

doc/tutorial/pipeline/nn_tutorial/GPT2-multi-task.ipynb renamed to doc/tutorial/fate_llm/GPT2-multi-task.ipynb

Lines changed: 3 additions & 3 deletions
@@ -5,15 +5,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Multi-Task Federated Learning with GPT-2 using FATE-1.11"
+    "# Multi-Task Federated Learning with GPT-2 using FATE-LLM"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In this tutorial, we will explore the implementation of multi-task federated learning with LM: GPT-2 using the FATE-1.11 framework. FATE-1.11 provides the \"pellm\" module for efficient federated learning. It is specifically designed for large language models in a federated setting.\n",
+    "In this tutorial, we will explore the implementation of multi-task federated learning with the LM GPT-2 using the FATE-LLM framework. FATE-LLM provides the \"pellm\" module for efficient federated learning, specifically designed for large language models in a federated setting.\n",
    "\n",
    "Multi-task learning involves training a model to perform multiple tasks simultaneously. In this tutorial, we will focus on two tasks - sentiment classification and named entity recognition (NER) - and show how they can be combined with GPT-2 in a federated learning setting. We will use the IMDB sentiment analysis dataset and the CoNLL-2003 NER dataset for our tasks.\n",
    "\n",
@@ -699,7 +699,7 @@
    "dataset_param = DatasetParam(dataset_name='multitask_ds', take_limits=50, tokenizer_name_or_path=model_path)\n",
    "# TrainerParam\n",
    "trainer_param = TrainerParam(trainer_name='multi_task_fedavg', epochs=1, batch_size=4, \n",
-    " data_loader_worker=8, secure_aggregate=False)\n",
+    " data_loader_worker=8, secure_aggregate=True)\n",
    "loss = t.nn.CustLoss(loss_module_name='multi_task_loss', class_name='MultiTaskLoss', task_weights=[0.5, 0.5])\n",
    "\n",
    "\n",

doc/tutorial/fate_llm/README.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# Usage
+Here we provide tutorials for FATE-LLM training:
+
+- [FATE-LLM example with GPT-2](GPT2-example.ipynb)
+- [FATE-LLM Multi-Task GPT-2: Classification and NER Tagging](GPT2-multi-task.ipynb)

doc/tutorial/pipeline/nn_tutorial/README.md

Lines changed: 2 additions & 3 deletions
@@ -66,10 +66,9 @@ In order to show you how to develop your own Trainer, here we try to develop a s

 Here we offer some advanced examples of using the FATE-NN framework.

-## Fed-PELLM (Parameter Efficient Large Language Model) Training
+## FATE-LLM (Federated Large Language Models) Training

-- [Federated PELLM example with GPT-2](./GPT2-example.ipynb)
-- [Federated Multi-Task GPT-2: Classification and NER Tagging](./GPT2-multi-task.ipynb)
+- [FATE-LLM Training Guides](../../fate_llm/README.md)

 ## Resnet classification (Homo-NN)

examples/dsl/v2/intersect/README.md

Lines changed: 27 additions & 41 deletions
@@ -4,103 +4,89 @@ This section introduces the dsl and conf for usage of different types of tasks.

 #### Intersection Task.

-1. RAW Intersection:
-    - dsl: test_intersect_job_dsl.json
-    - runtime_config: test_intersect_job_raw_conf.json
-
-2. RAW Intersection with SM3 Hashing:
-    - dsl: test_intersect_job_dsl.json
-    - runtime_config: test_intersect_job_raw_sm3_conf.json
-
-3. RSA Intersection:
+1. RSA Intersection:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_rsa_conf.json

-4. RSA Intersection with Random Base Fraction set to 0.5:
+2. RSA Intersection with Random Base Fraction set to 0.5:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_rsa_fraction_conf.json

-5. RSA Intersection with Calculation Split:
+3. RSA Intersection with Calculation Split:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_rsa_split_conf.json

-6. RSA Multi-hosts Intersection:
+4. RSA Multi-hosts Intersection:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_rsa_multi_host_conf.json

     This dsl is an example of a guest running intersection with two hosts using RSA intersection. It can be used with more than two hosts.

-7. RAW Multi-hosts Intersection:
-    - dsl: test_intersect_job_dsl.json
-    - runtime_config: test_intersect_job_raw_multi_host_conf.json
-
-    This dsl is an example of guest runs intersection with two hosts using rsa intersection. It can be used as more than two hosts.
-
-8. DH Intersection:
+5. DH Intersection:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_dh_conf.json

-9. DH Multi-host Intersection:
+6. DH Multi-host Intersection:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_dh_multi_conf.json

-10. ECDH Intersection:
-    - dsl: test_intersect_job_dsl.json
-    - runtime_config: test_intersect_job_ecdh_conf.json
+7. ECDH Intersection:
+    - dsl: test_intersect_job_dsl.json
+    - runtime_config: test_intersect_job_ecdh_conf.json

-11. ECDH Intersection with Preprocessing:
-    - dsl: test_intersect_job_dsl.json
-    - runtime_config: test_intersect_job_ecdh_w_preprocess_conf.json
+8. ECDH Intersection with Preprocessing:
+    - dsl: test_intersect_job_dsl.json
+    - runtime_config: test_intersect_job_ecdh_w_preprocess_conf.json

-12. RSA Intersection with Cache:
-    - dsl: test_intersect_job_cache_dsl.json
-    - runtime_config: test_intersect_job_rsa_cache_conf.json
+9. RSA Intersection with Cache:
+    - dsl: test_intersect_job_cache_dsl.json
+    - runtime_config: test_intersect_job_rsa_cache_conf.json

-13. DH Intersection with Cache:
+10. DH Intersection with Cache:
     - dsl: test_intersect_job_cache_dsl.json
     - runtime_config: test_intersect_job_dh_cache_conf.json

-14. ECDH Intersection with Cache:
+11. ECDH Intersection with Cache:
     - dsl: test_intersect_job_cache_dsl.json
     - runtime_config: test_intersect_job_ecdh_cache_conf.json

-15. RSA Intersection with Cache Loader:
+12. RSA Intersection with Cache Loader:
     - dsl: test_intersect_job_cache_loader_dsl.json
     - runtime_config: test_intersect_job_rsa_cache_loader_conf.json

-16. Estimated Intersect Cardinality:
+13. Estimated Intersect Cardinality:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_rsa_cardinality_conf.json

-17. Exact Intersect Cardinality with ECDH:
+14. Exact Intersect Cardinality with ECDH:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_ecdh_exact_cardinality_conf.json

-18. Exact Intersect Cardinality with DH:
+15. Exact Intersect Cardinality with DH:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_dh_exact_cardinality_conf.json

-19. DH Intersection with Preprocessing:
+16. DH Intersection with Preprocessing:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_dh_w_preprocess_conf.json

-20. RSA Intersection with Preprocessing:
+17. RSA Intersection with Preprocessing:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_rsa_w_preprocess_conf.json

-21. ECDH Intersection with Cache Loader:
+18. ECDH Intersection with Cache Loader:
     - dsl: test_intersect_job_cache_loader_dsl.json
     - runtime_config: test_intersect_job_ecdh_cache_loader_conf.json

-22. Exact Multi-host Intersect Cardinality with ECDH:
+19. Exact Multi-host Intersect Cardinality with ECDH:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_ecdh_multi_exact_cardinality_conf.json

-23. Exact Multi-host Intersect Cardinality with DH:
+20. Exact Multi-host Intersect Cardinality with DH:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_dh_multi_exact_cardinality_conf.json

-24. Exact Multi-host Intersect with ECDH:
+21. Exact Multi-host Intersect with ECDH:
     - dsl: test_intersect_job_dsl.json
     - runtime_config: test_intersect_job_ecdh_multi_conf.json
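Each item above pairs one dsl file with one runtime conf, which is submitted to FATE Flow as a job. As a small sketch of that workflow (the `flow job submit` command form is assumed from FATE Flow's CLI; verify it against your installed version with `flow job submit --help`):

```python
# Sketch: submit one of the dsl/conf pairs listed above via the FATE Flow CLI.
# Assumption: the `flow` client is installed and initialized; paths are
# relative to the repository root.
import subprocess

dsl = "examples/dsl/v2/intersect/test_intersect_job_dsl.json"
conf = "examples/dsl/v2/intersect/test_intersect_job_rsa_conf.json"

subprocess.run(["flow", "job", "submit", "-c", conf, "-d", dsl], check=True)
```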

examples/dsl/v2/intersect/intersect_testsuite.json

Lines changed: 0 additions & 12 deletions
@@ -26,14 +26,6 @@
        }
    ],
    "tasks": {
-       "raw_intersect": {
-           "conf": "./test_intersect_job_raw_conf.json",
-           "dsl": "./test_intersect_job_dsl.json"
-       },
-       "raw_intersect_sm3": {
-           "conf": "./test_intersect_job_raw_sm3_conf.json",
-           "dsl": "./test_intersect_job_dsl.json"
-       },
        "rsa_intersect": {
            "conf": "./test_intersect_job_rsa_conf.json",
            "dsl": "./test_intersect_job_dsl.json"
@@ -54,10 +46,6 @@
            "conf": "./test_intersect_job_rsa_w_preprocess_conf.json",
            "dsl": "./test_intersect_job_dsl.json"
        },
-       "raw_intersect_multi_host": {
-           "conf": "./test_intersect_job_raw_multi_host_conf.json",
-           "dsl": "./test_intersect_job_dsl.json"
-       },
        "dh_intersect": {
            "conf": "./test_intersect_job_dh_conf.json",
            "dsl": "./test_intersect_job_dsl.json"
