Commit e6b2afe

dumper concurrent reading (#19)
* dumper concurrent reading
* fix
* update ipv6 regexp rule
1 parent c446a8f commit e6b2afe

18 files changed, +791 -35 lines changed

cmd/ips/cmd_dump.go

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ func init() {
 	dumpCmd.Flags().StringVarP(&readerOption, "input-option", "", "", UsageReaderOption)
 	dumpCmd.Flags().StringVarP(&hybridMode, "hybrid-mode", "", "aggregation", UsageHybridMode)
 	dumpCmd.Flags().StringVarP(&outputFile, "output-file", "o", "", UsageDumpOutputFile)
+	dumpCmd.Flags().IntVarP(&readerJobs, "reader-jobs", "", 0, UsageReaderJobs)

 }

cmd/ips/cmd_pack.go

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ func init() {
 	packCmd.Flags().StringVarP(&outputFile, "output-file", "o", "", UsagePackOutputFile)
 	packCmd.Flags().StringVarP(&outputFormat, "output-format", "", "", UsagePackOutputFormat)
 	packCmd.Flags().StringVarP(&writerOption, "output-option", "", "", UsageWriterOption)
+	packCmd.Flags().IntVarP(&readerJobs, "reader-jobs", "", 0, UsageReaderJobs)

 }

cmd/ips/config.go

Lines changed: 7 additions & 0 deletions
@@ -118,6 +118,9 @@ var (
 	// writerOption specifies the options for the writer.
 	writerOption string

+	// readerJobs specifies the number of concurrent reader jobs.
+	readerJobs int
+
 	// myip
 	// localAddr specifies the local address (in IP format) that should be used for outbound connections.
 	// Useful in systems with multiple network interfaces.
@@ -237,6 +240,10 @@ func GetFlagConfig() *ips.Config {
 		conf.WriterOption = writerOption
 	}

+	if readerJobs != 0 {
+		conf.ReaderJobs = readerJobs
+	}
+
 	if len(addr) != 0 {
 		conf.Addr = addr
 	}
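
The override in `GetFlagConfig` treats a zero flag value as "not set", so a value already present in the configuration survives when `--reader-jobs` is omitted. A minimal sketch of that pattern follows; `Config` here is a stand-in for `ips.Config`, and `applyReaderJobs` is a hypothetical helper, not a function from this commit.

```go
package main

import "fmt"

// Config is a stand-in for ips.Config; only the field touched by this commit is shown.
type Config struct {
	ReaderJobs int
}

// applyReaderJobs (hypothetical helper) mirrors the check added to GetFlagConfig:
// a flag left at its zero value means "not set", so the existing value is kept.
func applyReaderJobs(conf *Config, readerJobs int) {
	if readerJobs != 0 {
		conf.ReaderJobs = readerJobs
	}
}

func main() {
	conf := &Config{ReaderJobs: 4} // assumed to come from a saved configuration

	applyReaderJobs(conf, 0) // flag omitted: value stays 4
	fmt.Println(conf.ReaderJobs)

	applyReaderJobs(conf, 8) // flag given: value becomes 8
	fmt.Println(conf.ReaderJobs)
}
```

One consequence of this design is that the flag cannot be used to reset the setting to `0`; doing so would need a separate mechanism.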

cmd/ips/const.go

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ const (
 	UsageReaderOption = "Additional options for the database reader, if applicable."
 	UsageWriterOption = "Additional options for the database writer, if applicable."
 	UsageHybridMode = "Sets mode for multi-IP source handling; 'comparison' to compare, 'aggregation' to merge data."
+	UsageReaderJobs = "Set the number of concurrent reader jobs. This parameter controls the parallelism level of reading operations."

 	// Output Flags

docs/config.md

Lines changed: 11 additions & 0 deletions
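@@ -29,6 +29,7 @@
 * [dp_rewriter_files](#dprewriterfiles)
 * [reader_option](#readeroption)
 * [writer_option](#writeroption)
+* [reader_jobs](#readerjobs)
 * [myip_count](#myipcount)
 * [myip_timeout_s](#myiptimeouts)
 * [addr](#addr)
@@ -330,6 +331,16 @@ $ ips config set rewrite_files "/path/to/rewrite1.txt,/path/to/rewrite2.txt"

 For example, the `select_languages` option of the `mmdb` database; refer to the database documentation for details.

+### reader_jobs
+
+The `reader_jobs` parameter controls the number of concurrent jobs used for reading. It defines the maximum number of read operations that can run at the same time, enabling efficient data processing.
+
+By default there is no need to set `reader_jobs`; IPS sets it automatically based on the number of CPU cores and the output format.
+
+Note that setting the concurrency too high can intensify contention for system resources and degrade the program's overall performance.
+
+In some cases, increasing reader concurrency brings no performance gain. The completion time of a task is determined by the slower of the reader and the writer, and most writers (such as IPDB and MMDB) do not yet support concurrent writing, so choose a concurrency level that suits your situation.
+
 ### myip_count

 When querying the local IP address, this parameter defines the minimum number of detectors that must return the same IP address. The default value is `3`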

docs/config_en.md

Lines changed: 11 additions & 0 deletions
@@ -29,6 +29,7 @@
 * [dp_rewriter_files](#dprewriterfiles)
 * [reader_option](#readeroption)
 * [writer_option](#writeroption)
+* [reader_jobs](#readerjobs)
 * [myip_count](#myipcount)
 * [myip_timeout_s](#myiptimeouts)
 * [addr](#addr)
@@ -332,6 +333,16 @@ Some database formats provide additional writing options, which can be set durin

 For example, `mmdb` database's `select_languages` and so on, please refer to the database documentation for specific functions.

+### reader_jobs
+
+The `reader_jobs` parameter is designed to control the number of concurrent jobs for reading operations. It specifies the maximum number of reading operations that can be performed simultaneously, thereby enhancing the efficiency of data processing.
+
+By default, setting `reader_jobs` is not necessary, as IPS will automatically determine the appropriate number based on the system's CPU core count and the output format.
+
+It's important to note that setting an excessively high number of concurrent jobs may lead to intensified competition for system resources, potentially degrading the overall performance of the program.
+
+In some cases, increasing the concurrency of readers does not result in performance improvement. In fact, the completion time of tasks depends on the slower of the readers and writers. This is particularly relevant as most writers (such as IPDB and MMDB) currently do not support concurrent writing. Therefore, choose a suitable concurrency level based on your specific circumstances.
+
 ### myip_count

 When querying the local IP address, this parameter defines the minimum number of detectors that return the same IP address. The default value is `3`.
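
The `reader_jobs` documentation above says several readers feed writers that mostly cannot write concurrently, so the slower side sets the total run time. The sketch below illustrates that shape with a bounded pool of reader goroutines and a single writer loop. It is not the dumper code from this commit: `dump`, `readChunk`, and the integer "chunks" are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// dump reads the given chunks with at most `jobs` concurrent readers and hands
// every record to a single writer loop. readChunk is a hypothetical stand-in
// for reading one portion of a database.
func dump(chunks []int, jobs int, readChunk func(int) string) []string {
	if jobs < 1 {
		jobs = 1
	}
	work := make(chan int)
	records := make(chan string)

	// Start a fixed number of readers; this is the role reader_jobs plays.
	var wg sync.WaitGroup
	for i := 0; i < jobs; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range work {
				records <- readChunk(c)
			}
		}()
	}

	// Feed the readers, then signal that no more work is coming.
	go func() {
		for _, c := range chunks {
			work <- c
		}
		close(work)
	}()

	// Close the record stream once every reader has finished.
	go func() {
		wg.Wait()
		close(records)
	}()

	// Single writer: records are consumed one at a time, so extra readers
	// only help while reading is the slower side.
	var out []string
	for r := range records {
		out = append(out, r)
	}
	return out
}

func main() {
	out := dump([]int{1, 2, 3, 4}, 2, func(c int) string {
		return fmt.Sprintf("chunk-%d", c)
	})
	fmt.Println(len(out)) // prints 4; record order is not deterministic
}
```

Because `records` is unbuffered, readers block as soon as the writer falls behind, which is why raising the reader count past the writer's throughput adds contention without shortening the run.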

format/ipdb/writer.go

Lines changed: 5 additions & 0 deletions
@@ -221,3 +221,8 @@ func IntToBinaryBE(num, length int) []byte {
 		return []byte{}
 	}
 }
+
+// WriterFormat returns the format of the writer.
+func (w *Writer) WriterFormat() string {
+	return DBFormat
+}

format/mmdb/writer.go

Lines changed: 5 additions & 0 deletions
@@ -119,6 +119,11 @@ func (w *Writer) WriteTo(iw io.Writer) (int64, error) {
 	return w.writer.WriteTo(iw)
 }

+// WriterFormat returns the format of the writer.
+func (w *Writer) WriterFormat() string {
+	return DBFormat
+}
+
 // ConvertMap converts fields and values to a map.
 func (w *Writer) ConvertMap(fields, values []string) map[string]interface{} {
 	ret := make(map[string]interface{})

format/plain/writer.go

Lines changed: 5 additions & 0 deletions
@@ -112,3 +112,8 @@ func (w *Writer) Header() error {

 	return nil
 }
+
+// WriterFormat returns the format of the writer.
+func (w *Writer) WriterFormat() string {
+	return DBFormat
+}

format/writer.go

Lines changed: 3 additions & 0 deletions
@@ -38,6 +38,9 @@ type Writer interface {

 	// WriteTo writes data to io.Writer
 	WriteTo(w io.Writer) (int64, error)
+
+	// WriterFormat returns the format of the writer.
+	WriterFormat() string
 }

 // NewWriter creates a Writer based on its format or file name.
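
The new `WriterFormat` method gives callers a way to ask a writer which format it produces, and the docs in this commit say the default reader concurrency depends on CPU count and output format. The heuristic itself is not visible in the hunks shown here, so the sketch below is only a guess at how the two could fit together. `defaultReaderJobs`, `fakeWriter`, the format strings `"ipdb"`, `"mmdb"`, `"plain"`, and the cap of 2 are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// formatter is the one-method slice of format.Writer that this sketch needs.
type formatter interface {
	WriterFormat() string
}

// fakeWriter is a hypothetical writer used only to drive the example.
type fakeWriter struct{ format string }

func (w fakeWriter) WriterFormat() string { return w.format }

// defaultReaderJobs is a hypothetical heuristic, not the one in this commit.
// An explicit setting wins; otherwise formats assumed to write serially get a
// small cap, and every other format gets one job per CPU.
func defaultReaderJobs(w formatter, configured, cpus int) int {
	if configured > 0 {
		return configured
	}
	switch w.WriterFormat() {
	case "ipdb", "mmdb": // assumed format names for writers without concurrent writes
		if cpus > 2 {
			return 2 // illustrative cap, not a measured value
		}
	}
	if cpus < 1 {
		return 1
	}
	return cpus
}

func main() {
	cpus := runtime.NumCPU()
	fmt.Println(defaultReaderJobs(fakeWriter{"mmdb"}, 0, cpus))
	fmt.Println(defaultReaderJobs(fakeWriter{"plain"}, 0, cpus))
}
```

Passing the CPU count as a parameter rather than calling `runtime.NumCPU()` inside the function keeps the heuristic easy to test on any machine.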
