Merged
159 commits
37d27f9
initial commit
1649759610 Jun 24, 2022
d65208e
refine readme
1649759610 Jun 24, 2022
ac4d644
refine codestyle
1649759610 Jun 24, 2022
3f433b9
refine readme
1649759610 Jun 24, 2022
d3f6ada
refine readme
1649759610 Jun 24, 2022
54ed34b
fix model saving bug
1649759610 Jun 26, 2022
63b0a76
Merge branch 'develop' into develop
Jul 6, 2022
4669194
initial commit
1649759610 Jul 11, 2022
f6f93e1
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jul 11, 2022
fb3ade1
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Jul 11, 2022
a83a902
initial commit
1649759610 Jul 11, 2022
7bd988a
initial commit
1649759610 Jul 12, 2022
700810a
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jul 12, 2022
68e025a
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jul 21, 2022
02a997b
use common metric instead of eval_metrics.py and remove unuseful code
1649759610 Jul 28, 2022
1500e5f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Jul 28, 2022
faaf5f5
Merge branch 'develop' into develop
1649759610 Aug 1, 2022
6a512a0
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Aug 2, 2022
a99fc68
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Aug 12, 2022
4b5fa30
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Nov 1, 2022
b837b67
mv stage project to ASO_analysis
1649759610 Nov 7, 2022
f415740
add unified sentiment analysis
1649759610 Nov 7, 2022
41b020d
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Nov 7, 2022
252860d
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Nov 7, 2022
aed2d64
refine readme
1649759610 Nov 7, 2022
8899a12
refine readme
1649759610 Nov 7, 2022
425a273
refnie readme
1649759610 Nov 7, 2022
acd9add
add unified sentiment analysis
1649759610 Nov 7, 2022
4796016
refine readme
1649759610 Nov 7, 2022
e857c6a
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Nov 21, 2022
88649f4
initial commit
1649759610 Nov 25, 2022
4141916
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Nov 25, 2022
12628a7
initial commit
1649759610 Nov 25, 2022
70ba157
refine readme
1649759610 Nov 28, 2022
531428e
add taskflow for sentiment analysis with UIE
1649759610 Nov 28, 2022
8e8b853
refine Readme
1649759610 Nov 29, 2022
410f0e4
refine readme.md
1649759610 Nov 30, 2022
64edd33
support sentiment analysis (UIE) with inputing by file format
1649759610 Nov 30, 2022
1f9637c
refine readme
1649759610 Nov 30, 2022
93509cd
delete predict scripts
1649759610 Nov 30, 2022
6473b10
refine readme
1649759610 Nov 30, 2022
09d0f12
delete unuseful files
1649759610 Nov 30, 2022
e546952
add pipeline for sentiment_analysis
1649759610 Dec 6, 2022
447fc14
merging with the newest code
1649759610 Dec 6, 2022
83252b0
merging code with the newest code
1649759610 Dec 6, 2022
f57df8c
fix to convert data without synonyms
1649759610 Dec 9, 2022
249a8a9
add senta pipeline
1649759610 Dec 9, 2022
cd3f4e7
refine readme
1649759610 Dec 9, 2022
a1de96d
drop functions: inputting file and saving results
1649759610 Dec 9, 2022
a5f83b1
add UIE-seta-[base, medium, mini, micro, nano]
1649759610 Dec 9, 2022
1da02d0
modify .gitignore to trace deploy code
1649759610 Dec 9, 2022
c4c135a
add deploy with SimpleServer
1649759610 Dec 9, 2022
363963b
add debug mode
1649759610 Dec 12, 2022
a0c8608
fix debug mode
1649759610 Dec 12, 2022
5afa387
update the loading method of UIE
1649759610 Dec 12, 2022
7109f24
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Dec 12, 2022
bb7da20
refine readme
1649759610 Dec 12, 2022
1af076f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Dec 12, 2022
8c104f2
fix bug caused by version updating
1649759610 Dec 12, 2022
332a1e4
fix hard coding for model name.
1649759610 Dec 12, 2022
5e550bd
refine codestyle
1649759610 Dec 13, 2022
98c7084
modify readme according the way of 'step by step'
1649759610 Dec 13, 2022
7ec344f
refine codestyl
1649759610 Dec 13, 2022
6af902d
change saving txt to json files
1649759610 Dec 13, 2022
a7f57ea
download font automatically when not input font_path
1649759610 Dec 13, 2022
7c331d2
change readme in the way 'step by step'
1649759610 Dec 13, 2022
d78f73b
add model prediction by batch
1649759610 Dec 13, 2022
e6f0359
add uie-senta-x to support_schema_list
1649759610 Dec 13, 2022
e17d624
update sentiment analysis in taskflow
1649759610 Dec 13, 2022
485044c
add prediction with saved offline model
1649759610 Dec 14, 2022
267509c
change the exception exposure way
1649759610 Dec 14, 2022
30e3044
add description for visual schema
1649759610 Dec 14, 2022
7c88879
delete comments
1649759610 Dec 14, 2022
3e8c777
remove comments
1649759610 Dec 14, 2022
f3429b7
remove unused code and comments
1649759610 Dec 14, 2022
6c2a712
convert uie-senta-x model params to fit ernie/uie
1649759610 Dec 14, 2022
6f15864
refine readme for sentiment analysis
1649759610 Dec 16, 2022
55276fa
add running time
1649759610 Dec 16, 2022
6c4fd93
refine readme for senta pipeline
1649759610 Dec 16, 2022
d0729c0
change uie-base to uie-senta-base
1649759610 Dec 16, 2022
46fc8fe
load uie-senta-x with auto module
1649759610 Dec 16, 2022
ee5938b
add deploy with SimpleServer
1649759610 Dec 16, 2022
46c417c
refine codestyle
1649759610 Dec 16, 2022
96bd92f
refine readme
1649759610 Dec 16, 2022
fbf6567
add uie-senta-x to support_schema_list
1649759610 Dec 16, 2022
ac7c5ed
fix hard coding for mdoel anme
1649759610 Dec 16, 2022
d21452b
refine codestyle
1649759610 Dec 16, 2022
4128753
refine codestyl
1649759610 Dec 16, 2022
4d072d7
refine codestyle
1649759610 Dec 16, 2022
661e944
refine codestyle
1649759610 Dec 16, 2022
77c090f
refine codestyle
1649759610 Dec 16, 2022
128d154
refine codestyle
1649759610 Dec 16, 2022
8c59f76
refine codestyle
1649759610 Dec 16, 2022
8651ae2
refine codestyle
1649759610 Dec 16, 2022
e14ff4a
refine codestyle
1649759610 Dec 16, 2022
aae9da1
fix senta response
1649759610 Dec 16, 2022
bb441ca
add uie_senta_x
1649759610 Dec 16, 2022
d99c204
refine codestyle
1649759610 Dec 19, 2022
5650404
remove lambda expressions
1649759610 Dec 19, 2022
1614c6c
add link of senta pipeline
1649759610 Dec 19, 2022
5f89ceb
refine codestyle
1649759610 Dec 19, 2022
9a9be2c
remove local path
1649759610 Dec 19, 2022
87782d3
Merge branch 'develop' into develop
1649759610 Dec 20, 2022
288aaab
fix typos
1649759610 Dec 20, 2022
92f1278
refine readme
1649759610 Dec 20, 2022
0190ca1
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Dec 20, 2022
485f3cf
refine readme
1649759610 Dec 22, 2022
9400392
Merge branch 'automodel' into develop
1649759610 Dec 22, 2022
432e5e2
load uie-senta-x with automodel
1649759610 Dec 22, 2022
422ac9a
remove commented code
1649759610 Dec 22, 2022
98779c9
restore auto
1649759610 Dec 22, 2022
99fe1cb
Merge branch 'develop' into develop
1649759610 Dec 22, 2022
22394fa
add link of hotel dataset to readme.
1649759610 Dec 26, 2022
48b20c5
add link for downloading test_hotel.txt
1649759610 Dec 26, 2022
93073c5
fix url problem for server and client
1649759610 Dec 26, 2022
9d1243f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Dec 26, 2022
075a4ae
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Dec 26, 2022
5783f19
refine readme
1649759610 Dec 26, 2022
e61ad7b
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Dec 26, 2022
19265f3
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Dec 28, 2022
ff664d2
fix for senta_examples.py
1649759610 Dec 28, 2022
69932d3
update visualization function
1649759610 Dec 29, 2022
f922241
update visualization function
1649759610 Dec 29, 2022
0b5c224
refine readme and update visualization description
1649759610 Dec 29, 2022
7d618a8
update visualization function
1649759610 Dec 29, 2022
a62d8c5
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Dec 29, 2022
ce9825a
refine readme and update visualization function
1649759610 Dec 29, 2022
a3ec63a
change logger in PaddleNLP to log information
1649759610 Dec 29, 2022
c344b8c
fix running time for skep and uie
1649759610 Dec 30, 2022
df1ad55
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Dec 30, 2022
7f00796
fix bug to solve tokenizer updating problem
1649759610 Dec 30, 2022
1ea8757
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Dec 30, 2022
3123147
refine label-studio readme
1649759610 Jan 3, 2023
88060da
refine label-studio readme
1649759610 Jan 3, 2023
ab18fc8
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jan 3, 2023
fc52a00
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Jan 3, 2023
ac89d49
refine label-studio readme
1649759610 Jan 3, 2023
579631b
optimize example construction for a, o, as, ao extraction task
1649759610 Jan 5, 2023
b4364cf
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jan 5, 2023
30ea32f
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Jan 5, 2023
4bfab1c
add the labeling method for ext task: a, as, ao and so on.
1649759610 Jan 5, 2023
091e216
add note for visual_analysis.py
1649759610 Jan 5, 2023
0a5b991
change link for downloading data and refine log output
1649759610 Jan 6, 2023
070bfda
refine log output
1649759610 Jan 6, 2023
c34bedb
refine readme
1649759610 Jan 6, 2023
e5a94c2
expose options interface
1649759610 Jan 6, 2023
e624418
refine readme
1649759610 Jan 6, 2023
a021936
modify typos
1649759610 Jan 6, 2023
b754bb1
expose options for customing sentiment analysis
1649759610 Jan 6, 2023
561f607
README.md
1649759610 Jan 9, 2023
7a2b5ca
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jan 10, 2023
1792554
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jan 10, 2023
604382e
fix bug for param is_shuffle in label_studio.py
1649759610 Jan 12, 2023
40e70c4
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jan 12, 2023
3acb29d
Merge branch 'develop' of https://github.com/1649759610/PaddleNLP int…
1649759610 Jan 12, 2023
7b1d031
[BugFix] Fix the param is_shuffle problem
1649759610 Jan 12, 2023
269468c
[BugFix] Fix the bool param is_shuffle problem
1649759610 Jan 12, 2023
3801098
[BugFix] Fix the bool param is_shuffle problem
1649759610 Jan 12, 2023
804f394
Merge branch 'PaddlePaddle:develop' into develop
1649759610 Jan 12, 2023
12 changes: 11 additions & 1 deletion applications/information_extraction/label_studio.py
@@ -33,6 +33,16 @@ def set_seed(seed):
     np.random.seed(seed)


+def str2bool(v):
+    """Support bool type for argparse."""
+    if v.lower() in ("yes", "true", "t", "y", "1"):
+        return True
+    elif v.lower() in ("no", "false", "f", "n", "0"):
+        return False
+    else:
+        raise argparse.ArgumentTypeError("Unsupported value encountered.")
+
+
 def do_convert():
     set_seed(args.seed)

@@ -125,7 +135,7 @@ def _save_examples(save_dir, file_name, examples):
     parser.add_argument("--task_type", choices=['ext', 'cls'], default="ext", type=str, help="Select task type, ext for the extraction task and cls for the classification task, defaults to ext.")
     parser.add_argument("--options", default=["正向", "负向"], type=str, nargs="+", help="Used only for the classification task, the options for classification")
     parser.add_argument("--prompt_prefix", default="情感倾向", type=str, help="Used only for the classification task, the prompt prefix for classification")
-    parser.add_argument("--is_shuffle", default=True, type=bool, help="Whether to shuffle the labeled dataset, defaults to True.")
+    parser.add_argument("--is_shuffle", default="True", type=str2bool, help="Whether to shuffle the labeled dataset, defaults to True.")
     parser.add_argument("--layout_analysis", default=False, type=bool, help="Enable layout analysis to optimize the order of OCR result.")
     parser.add_argument("--seed", type=int, default=1000, help="Random seed for initialization")
     parser.add_argument("--separator", type=str, default='##', help="Used only for entity/aspect-level classification task, separator for entity label and classification label")
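The reason for replacing `type=bool` is that argparse hands the raw command-line string to the type callable, and `bool("False")` is `True` because any non-empty string is truthy. The sketch below reproduces this with a throwaway parser. `parse_is_shuffle` is a hypothetical helper written only for illustration, and `str2bool` mirrors the function added in this diff.

```python
import argparse


def str2bool(v):
    """Support bool type for argparse (same logic as the helper in this diff)."""
    if v.lower() in ("yes", "true", "t", "y", "1"):
        return True
    elif v.lower() in ("no", "false", "f", "n", "0"):
        return False
    else:
        raise argparse.ArgumentTypeError("Unsupported value encountered.")


def parse_is_shuffle(argv, arg_type, default):
    """Hypothetical helper: parse --is_shuffle from argv with a given type callable."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--is_shuffle", default=default, type=arg_type)
    return parser.parse_args(argv).is_shuffle


# Before the fix: bool("False") is True, so shuffling could not be switched off.
old_value = parse_is_shuffle(["--is_shuffle", "False"], bool, True)

# After the fix: the string is interpreted by str2bool.
new_value = parse_is_shuffle(["--is_shuffle", "False"], str2bool, "True")

# String defaults are also run through the type callable, so "True" becomes True.
default_value = parse_is_shuffle([], str2bool, "True")

print(old_value, new_value, default_value)
```

Because argparse converts string defaults with the same type callable, `default="True"` still yields the boolean `True`. The unchanged `--layout_analysis` argument in this file is still declared with `type=bool`, so it would presumably show the same behavior.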
@@ -23,7 +23,7 @@

 import numpy as np
 import paddle
-from utils import load_txt
+from utils import load_txt, str2bool

 from paddlenlp.utils.log import logger

@@ -727,7 +727,7 @@ def _save_examples(save_dir, file_name, examples):
     parser.add_argument("--splits", default=[0.8, 0.1, 0.1], type=float, nargs="*", help="The ratio of samples in datasets. [0.6, 0.2, 0.2] means 60% samples used for training, 20% for evaluation and 20% for test.")
     parser.add_argument("--task_type", choices=['ext', 'cls'], default="ext", type=str, help="Two task types [ext, cls] are supported, ext represents the aspect-based extraction task and cls represents the sentence-level classification task, defaults to ext.")
     parser.add_argument("--options", type=str, nargs="+", help="Used only for the classification task, the options for classification")
-    parser.add_argument("--is_shuffle", default=True, type=bool, help="Whether to shuffle the labeled dataset, defaults to True.")
+    parser.add_argument("--is_shuffle", type=str2bool, default="True", help="Whether to shuffle the labeled dataset, defaults to True.")
     parser.add_argument("--seed", type=int, default=1000, help="Random seed for initialization")

     args = parser.parse_args()
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+import argparse
 import json
 import random
 import re
@@ -50,6 +51,16 @@ def write_json_file(examples, save_path):
             f.write(line + "\n")


+def str2bool(v):
+    """Support bool type for argparse."""
+    if v.lower() in ("yes", "true", "t", "y", "1"):
+        return True
+    elif v.lower() in ("no", "false", "f", "n", "0"):
+        return False
+    else:
+        raise argparse.ArgumentTypeError("Unsupported value encountered.")
+
+
 def create_data_loader(dataset, mode="train", batch_size=1, trans_fn=None):
     """
     Create dataloader.
13 changes: 8 additions & 5 deletions model_zoo/uie/doccano.py
@@ -13,15 +13,16 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-import os
-import time
 import argparse
 import json
+import os
+import time
 from decimal import Decimal

 import numpy as np
+from paddlenlp.utils.log import logger
+from utils import convert_cls_examples, convert_ext_examples, set_seed, str2bool

-from utils import set_seed, convert_ext_examples, convert_cls_examples
-from paddlenlp.utils.log import logger

+
 def do_convert():
@@ -100,6 +101,8 @@ def _save_examples(save_dir, file_name, examples):
             indexes = np.random.permutation(len(raw_examples))
             index_list = indexes.tolist()
             raw_examples = [raw_examples[i] for i in indexes]
+        else:
+            index_list = list(range(len(raw_examples)))

         i1, i2, _ = args.splits
         p1 = int(len(raw_examples) * i1)
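The `else` branch added in this hunk makes sure `index_list` is defined whether or not shuffling is enabled. A minimal sketch of that pattern follows, under several assumptions: `split_examples` is a hypothetical standalone function, the `p2` computation is a guess at code that lies outside the hunk, and the standard library's `random` stands in for `np.random.permutation` so the example has no third-party dependency.

```python
import random


def split_examples(raw_examples, splits, is_shuffle, seed=1000):
    """Hypothetical helper: split examples and report their original indexes."""
    if is_shuffle:
        # Stand-in for np.random.permutation(len(raw_examples)).
        rng = random.Random(seed)
        index_list = rng.sample(range(len(raw_examples)), len(raw_examples))
        raw_examples = [raw_examples[i] for i in index_list]
    else:
        # Without this branch, index_list would be undefined when shuffling is off.
        index_list = list(range(len(raw_examples)))

    i1, i2, _ = splits
    p1 = int(len(raw_examples) * i1)
    p2 = int(len(raw_examples) * (i1 + i2))  # assumed, not shown in the hunk
    return raw_examples[:p1], raw_examples[p1:p2], raw_examples[p2:], index_list


examples = list("abcdefghij")
train, dev, test, index_list = split_examples(examples, [0.8, 0.1, 0.1], is_shuffle=False)
print(len(train), len(dev), len(test), index_list)
```

With `is_shuffle=False` the index list is simply the identity order, so any later code that maps split entries back to the original examples can treat both cases uniformly.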
@@ -164,7 +167,7 @@ def _save_examples(save_dir, file_name, examples):
     parser.add_argument("--task_type", choices=['ext', 'cls'], default="ext", type=str, help="Select task type, ext for the extraction task and cls for the classification task, defaults to ext.")
     parser.add_argument("--options", default=["正向", "负向"], type=str, nargs="+", help="Used only for the classification task, the options for classification")
     parser.add_argument("--prompt_prefix", default="情感倾向", type=str, help="Used only for the classification task, the prompt prefix for classification")
-    parser.add_argument("--is_shuffle", default=True, type=bool, help="Whether to shuffle the labeled dataset, defaults to True.")
+    parser.add_argument("--is_shuffle", default="True", type=str2bool, help="Whether to shuffle the labeled dataset, defaults to True.")
     parser.add_argument("--seed", type=int, default=1000, help="Random seed for initialization")
     parser.add_argument("--separator", type=str, default='##', help="Used only for entity/aspect-level classification task, separator for entity label and classification label")
     parser.add_argument("--schema_lang", choices=["ch", "en"], default="ch", help="Select the language type for schema.")
11 changes: 11 additions & 0 deletions model_zoo/uie/utils.py
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+import argparse
 import json
 import math
 import random
@@ -30,6 +31,16 @@ def set_seed(seed):
     np.random.seed(seed)


+def str2bool(v):
+    """Support bool type for argparse."""
+    if v.lower() in ("yes", "true", "t", "y", "1"):
+        return True
+    elif v.lower() in ("no", "false", "f", "n", "0"):
+        return False
+    else:
+        raise argparse.ArgumentTypeError("Unsupported value encountered.")
+
+
 def create_data_loader(dataset, mode="train", batch_size=1, trans_fn=None):
     """
     Create dataloader.
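For comparison, the standard library offers another way to express a boolean switch. This is not what the PR does: the sketch below assumes Python 3.9 or later and uses `argparse.BooleanOptionalAction`, which changes the command-line syntax from `--is_shuffle False` to `--no-is_shuffle`.

```python
import argparse

parser = argparse.ArgumentParser()
# BooleanOptionalAction registers both --is_shuffle and --no-is_shuffle.
parser.add_argument("--is_shuffle", action=argparse.BooleanOptionalAction, default=True)

shuffle_default = parser.parse_args([]).is_shuffle
shuffle_on = parser.parse_args(["--is_shuffle"]).is_shuffle
shuffle_off = parser.parse_args(["--no-is_shuffle"]).is_shuffle

print(shuffle_default, shuffle_on, shuffle_off)
```

The `str2bool` approach taken in the diff keeps existing invocations such as `--is_shuffle True` working, which is a reasonable argument for preferring it in scripts that users already call with an explicit value.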