General Library

Quick Glance

For io:

`longling.lib.stream.to_io`(stream, ...)	Convert an object to an io stream; the object can be a file path or an io stream.
`longling.lib.stream.as_io`(src, ...)	`with` wrapper for the to_io function; the default mode is "r".
`longling.lib.stream.as_out_io`(tar, ...)	`with` wrapper for the to_io function; the default mode is "w".
`longling.lib.loading.loading`(src, ...)	Read a file line by line with buffering.

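Iterators

`longling.lib.iterator.AsyncLoopIter`(src[, ...])	Asynchronous loop iterator, suitable for loading files.
`longling.lib.iterator.CacheAsyncLoopIter`(...)	Asynchronous iterator with a cache, suitable for files that need preprocessing.
`longling.lib.iterator.iterwrap`(itertype, ...)	Iterator decorator for cases where an iterator should be reusable: it turns an iterator-generating function into a function whose result can be iterated repeatedly. Uses AsyncLoopIter by default.

Logging

`longling.lib.utilog.config_logging`([...])	Main logging configuration function.

For path

`longling.lib.path.path_append`(path, *addition)	Join path segments.
`longling.lib.path.abs_current_dir`(filepath)	Get the absolute path of the directory containing the file.
`longling.lib.path.file_exist`(filepath)	Check whether the file exists.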

Syntactic sugar

`longling.lib.candylib.as_list`(obj)	A utility function that converts the argument to a list if it is not already.

Timing and progress

`longling.lib.clock.print_time`(tips[, logger])	Measure and print the running time of a script, in seconds.
`longling.lib.clock.Clock`(store_dict, ...[, tips])	Timer that tracks two kinds of time: wall_time and process_time.
`longling.lib.stream.flush_print`(*values, ...)	Print with flushing.

Concurrency

`longling.lib.concurrency.concurrent_pool`(...)	Simple API for starting completely independent concurrent programs.

Testing

`longling.lib.testing.simulate_stdin`(*inputs)	Simulate standard input in tests.

Data structures

`longling.lib.structure.AttrDict`
`longling.lib.structure.nested_update`
`longling.lib.structure.SortedList`

Regular expressions

`longling.lib.regex.variable_replace`
`longling.lib.regex.default_variable_replace`

candylib

longling.lib.candylib.as_list(obj) → list

A utility function that converts the argument to a list if it is not already.

Parameters:	obj (object) -- argument to be converted to a list
Returns:	list_obj -- If obj is a list or tuple, return it as a list. Otherwise, return [obj] as a single-element list.
Return type:	list

Examples

>>> as_list(1)
[1]
>>> as_list([1])
[1]
>>> as_list((1, 2))
[1, 2]

longling.lib.candylib.dict2pv(dict_obj: dict, path_to_node: list = None)

>>> dict_obj = {"a": {"b": [1, 2], "c": "d"}, "e": 1}
>>> path, value = dict2pv(dict_obj)
>>> path
[['a', 'b'], ['a', 'c'], ['e']]
>>> value
[[1, 2], 'd', 1]

longling.lib.candylib.list2dict(list_obj, value=None, dict_obj=None)

>>> list_obj = ["a", 2, "c"]
>>> list2dict(list_obj, 10)
{'a': {2: {'c': 10}}}

longling.lib.candylib.get_dict_by_path(dict_obj, path_to_node)

>>> dict_obj = {"a": {"b": {"c": 1}}}
>>> get_dict_by_path(dict_obj, ["a", "b", "c"])
1

longling.lib.candylib.format_byte_sizeof(num, suffix='B')

Examples

>>> format_byte_sizeof(1024)
'1.00KB'

longling.lib.candylib.group_by_n(obj: list, n: int) → list

Examples

>>> list_obj = [1, 2, 3, 4, 5, 6]
>>> group_by_n(list_obj, 3)
[[1, 2, 3], [4, 5, 6]]

longling.lib.candylib.as_ordered_dict(dict_data: (<class 'dict'>, <class 'collections.OrderedDict'>), index: (<class 'list'>, None) = None)

Examples

>>> as_ordered_dict({0: 0, 2: 123, 1: 1})
OrderedDict([(0, 0), (2, 123), (1, 1)])
>>> as_ordered_dict({0: 0, 2: 123, 1: 1}, [2, 0, 1])
OrderedDict([(2, 123), (0, 0), (1, 1)])
>>> as_ordered_dict(OrderedDict([(2, 123), (0, 0), (1, 1)]))
OrderedDict([(2, 123), (0, 0), (1, 1)])

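clock

class longling.lib.clock.Clock(store_dict: (<class 'dict'>, None) = None, logger: (<class 'logging.Logger'>, None) = <Logger clock (INFO)>, tips='')

Timer that tracks two kinds of time: wall_time and process_time.

wall_time: the running time of the program, including waiting time
process_time: the running time of the program, excluding waiting time

Parameters:

store_dict (dict or None) -- stores the running time when the clock is used in a with block
logger (logging.Logger) -- the logger
tips (str) -- prefix of the printed message

Examples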

with Clock():
    a = 1 + 1
clock = Clock()
clock.start()
# some code
clock.end(wall=True) # default to return the wall_time, to get process_time, set wall=False

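end(wall=True): stop timing and return the elapsed time

process_time: the running time of the program (excluding waiting time)

start(): start timing

wall_time: the running time of the program (including waiting time)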
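The difference between the two kinds of time can be seen with the standard library alone. The sketch below is not Clock's implementation: the helper name `measure` and the `store` dictionary keys are chosen here for illustration, and it assumes that wall time corresponds to `time.perf_counter()` and process time to `time.process_time()`.

```python
import time
from contextlib import contextmanager


@contextmanager
def measure(store):
    """Record wall-clock and CPU time of the enclosed block into `store`."""
    wall_start = time.perf_counter()   # includes time spent sleeping or waiting
    cpu_start = time.process_time()    # counts only CPU time used by this process
    try:
        yield store
    finally:
        store["wall_time"] = time.perf_counter() - wall_start
        store["process_time"] = time.process_time() - cpu_start


timings = {}
with measure(timings):
    time.sleep(0.2)  # waiting adds to wall_time but hardly to process_time
```

After the block, `timings["wall_time"]` is roughly the sleep duration while `timings["process_time"]` stays close to zero, which is why the two values are reported separately.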

longling.lib.clock.print_time(tips: str = '', logger=<Logger clock (INFO)>)

Measure and print the running time of a script, in seconds.

Parameters:

tips (str) --
logger (logging.Logger or logging) --

Examples

>>> with print_time("tips"):
...     a = 1 + 1  # The code you want to test

longling.lib.clock.Timer: alias of longling.lib.clock.Clock

concurrency

longling.lib.concurrency.concurrent_pool(level: str, pool_size: int = None, ret: list = None)

Simple API for starting completely independent concurrent programs at one of the following levels:

thread
process
coroutine

Examples

def pseudo(idx):
    return idx

ret = []
with concurrent_pool("p", ret=ret) as e:  # or concurrent_pool("t", ret=ret)
    for i in range(4):
        e.submit(pseudo, i)
print(ret)

[0, 1, 2, 3]
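The pattern in the example above (submit independent tasks, then read their results from `ret`) can be sketched with `concurrent.futures`. This is only an illustration for the thread level: `thread_pool` and `_CollectingPool` are hypothetical names, and collecting results in submission order is an assumption made here, not a documented guarantee of concurrent_pool.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class _CollectingPool:
    """Hypothetical helper that remembers every submitted future."""

    def __init__(self, executor):
        self._executor = executor
        self.futures = []

    def submit(self, fn, *args, **kwargs):
        future = self._executor.submit(fn, *args, **kwargs)
        self.futures.append(future)
        return future


@contextmanager
def thread_pool(pool_size=None, ret=None):
    """Run submitted tasks in threads and append their results to `ret` on exit."""
    with ThreadPoolExecutor(max_workers=pool_size) as executor:
        pool = _CollectingPool(executor)
        yield pool
        if ret is not None:
            # Collect in submission order so the output is deterministic.
            ret.extend(f.result() for f in pool.futures)


def pseudo(idx):
    return idx


results = []
with thread_pool(pool_size=2, ret=results) as pool:
    for i in range(4):
        pool.submit(pseudo, i)
```

Because the results are gathered from the stored futures rather than in completion order, `results` matches the order of the `submit` calls.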

class longling.lib.concurrency.ThreadPool(max_workers=None, thread_name_prefix='', ret: list = None)

submit(fn, *args, **kwargs)

Submits a callable to be executed with the given arguments.

Schedules the callable to be executed as fn(*args, **kwargs) and returns a Future instance representing the execution of the callable.

Returns:	A Future representing the given call.

class longling.lib.concurrency.ProcessPool(processes=None, *args, ret: list = None, **kwargs)

formatter

longling.lib.formatter.dict_format(data: dict, digits=6, col: int = None)

Examples

>>> print(dict_format({"a": 123, "b": 3, "c": 4, "d": 5}))  # doctest: +NORMALIZE_WHITESPACE
a: 123      b: 3    c: 4    d: 5
>>> print(dict_format({"a": 123, "b": 3, "c": 4, "d": 5}, col=3))  # doctest: +NORMALIZE_WHITESPACE
a: 123      b: 3    c: 4
d: 5

longling.lib.formatter.pandas_format(data: (<class 'dict'>, <class 'list'>, <class 'tuple'>), columns: list = None, index: (<class 'list'>, <class 'str'>) = None, orient='index', pd_kwargs: dict = None, max_rows=80, max_columns=80, **kwargs)

Parameters:

data (dict, list, tuple, pd.DataFrame) --
columns (list, default None) -- Column labels to use when orient='index'. Raises a ValueError if used with orient='columns'.
index (list of strings) -- Optional display names matching the labels (same order).
orient ({'columns', 'index'}, default 'index') -- The "orientation" of the data. If the keys of the passed dict should be the rows of the resulting DataFrame, pass 'index' (default). Otherwise, if the keys should be the columns, pass 'columns'.
pd_kwargs (dict) --
max_rows ((int, None), default 80) --
max_columns ((int, None), default 80) --

Examples

>>> print(pandas_format({"a": {"x": 1, "y": 2}, "b": {"x": 1.0, "y": 3}},  ["x", "y"]))
     x  y
a  1.0  2
b  1.0  3
>>> print(pandas_format([[1.0, 2], [1.0, 3]],  ["x", "y"], index=["a", "b"]))
     x  y
a  1.0  2
b  1.0  3

longling.lib.formatter.table_format(data: (<class 'dict'>, <class 'list'>, <class 'tuple'>), columns: list = None, index: (<class 'list'>, <class 'str'>) = None, orient='index', pd_kwargs: dict = None, max_rows=80, max_columns=80, **kwargs)

Parameters:

data (dict, list, tuple, pd.DataFrame) --
columns (list, default None) -- Column labels to use when orient='index'. Raises a ValueError if used with orient='columns'.
index (list of strings) -- Optional display names matching the labels (same order).
orient ({'columns', 'index'}, default 'index') -- The "orientation" of the data. If the keys of the passed dict should be the rows of the resulting DataFrame, pass 'index' (default). Otherwise, if the keys should be the columns, pass 'columns'.
pd_kwargs (dict) --
max_rows ((int, None), default 80) --
max_columns ((int, None), default 80) --

Examples

>>> print(pandas_format({"a": {"x": 1, "y": 2}, "b": {"x": 1.0, "y": 3}},  ["x", "y"]))
     x  y
a  1.0  2
b  1.0  3
>>> print(pandas_format([[1.0, 2], [1.0, 3]],  ["x", "y"], index=["a", "b"]))
     x  y
a  1.0  2
b  1.0  3

longling.lib.formatter.series_format(data: dict, digits=6, col: int = None)

Examples

>>> print(dict_format({"a": 123, "b": 3, "c": 4, "d": 5}))  # doctest: +NORMALIZE_WHITESPACE
a: 123      b: 3    c: 4    d: 5
>>> print(dict_format({"a": 123, "b": 3, "c": 4, "d": 5}, col=3))  # doctest: +NORMALIZE_WHITESPACE
a: 123      b: 3    c: 4
d: 5

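iterator

class longling.lib.iterator.BaseIter(src, fargs=None, fkwargs=None, length=None, *args, **kwargs)

Iterator

Notes

If src is an iterator instance, its content is exhausted after one round of iteration and the iterator cannot be restarted.
To make the iterator loop repeatedly, src should be a function that generates the iterator instance, and reset() should be called after each loop.
If src has no __length__, len() cannot be called on a BaseIter instance before the first round of iteration finishes.

Examples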

# the content is exhausted after a single pass
with open("demo.txt") as f:
    bi = BaseIter(f)
    for line in bi:
        pass

# can be iterated multiple times
def open_file():
    with open("demo.txt") as f:
        for line in f:
            yield line

bi = BaseIter(open_file)
for _ in range(5):
    for line in bi:
        pass
    bi.reset()

# shorthand for the reusable form
@BaseIter.wrap
def open_file():
    with open("demo.txt") as f:
        for line in f:
            yield line

bi = open_file()
for _ in range(5):
    for line in bi:
        pass
    bi.reset()
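The notes above distinguish an iterator instance from a function that generates one. The following sketch shows why only the latter can be restarted. `ReiterableSketch` is a hypothetical stand-in written for this illustration, not BaseIter itself.

```python
class ReiterableSketch:
    """Hypothetical stand-in: rebuilds the iterator from a factory on reset()."""

    def __init__(self, src):
        self._src = src
        self._it = self._build()

    def _build(self):
        # A callable is treated as an iterator factory; anything else is reused as is.
        return self._src() if callable(self._src) else iter(self._src)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def reset(self):
        self._it = self._build()


def numbers():
    yield from range(3)


one_shot = ReiterableSketch(numbers())   # generator instance: spent after one pass
first_pass = list(one_shot)
one_shot.reset()
second_pass = list(one_shot)             # the same spent generator yields nothing

reusable = ReiterableSketch(numbers)     # generator function: can be rebuilt
passes = []
for _ in range(2):
    passes.append(list(reusable))
    reusable.reset()
```

With the generator instance, the second pass is empty even after `reset()`; with the generator function, every pass yields the full sequence.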

class longling.lib.iterator.MemoryIter(src, fargs=None, fkwargs=None, length=None, prefetch=False, *args, **kwargs)

In-memory iterator

Loads the entire content of the iterator into memory.

class longling.lib.iterator.LoopIter(src, fargs=None, fkwargs=None, length=None, *args, **kwargs)

Loop iterator

Automatically performs reset() after each round of iteration.

class longling.lib.iterator.AsyncLoopIter(src, fargs=None, fkwargs=None, tank_size=8, timeout=None, level='t')

Asynchronous loop iterator, suitable for loading files.

Reading the data and iterating over it happen asynchronously. Data is prefetched after reset().
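Asynchronous reading of this kind is commonly built from a background thread and a bounded queue. The sketch below illustrates that general technique only; `prefetch`, `_DONE` and the way `tank_size` is used as the queue size are assumptions made for this example, and the sketch omits the reset and timeout handling of the real class.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the source


def prefetch(factory, tank_size=8):
    """Yield items from factory() while a background thread reads ahead."""
    tank = queue.Queue(maxsize=tank_size)  # bounded buffer between producer and consumer

    def produce():
        for item in factory():
            tank.put(item)   # blocks while the buffer is full
        tank.put(_DONE)

    threading.Thread(target=produce, daemon=True).start()
    while True:
        item = tank.get()
        if item is _DONE:
            return
        yield item


def squares():
    for i in range(5):
        yield i * i


loaded = list(prefetch(squares, tank_size=2))
```

The bounded queue keeps memory use limited: the producer can run at most `tank_size` items ahead of the consumer.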

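class longling.lib.iterator.AsyncIter(src, fargs=None, fkwargs=None, tank_size=8, timeout=None, level='t')

Asynchronous loading iterator

Does not perform reset() automatically.

class longling.lib.iterator.CacheAsyncLoopIter(src, cache_file, fargs=None, fkwargs=None, rerun=True, tank_size=8, timeout=None, level='t')

Asynchronous iterator with a cache, suitable for files that need preprocessing.

Performs reset() automatically. In addition, when src is a function that may involve heavy preprocessing (that is, loading the data asynchronously takes much longer than iterating over the output), the preprocessed data produced during asynchronous loading is stored in the specified cache file.

longling.lib.iterator.iterwrap(itertype: str = 'AsyncLoopIter', *args, **kwargs)

Iterator decorator for cases where an iterator should be reusable: it turns an iterator-generating function into a function whose result can be iterated repeatedly. Uses AsyncLoopIter by default.

Examples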

@iterwrap()
def open_file():
    with open("demo.txt") as f:
        for line in f:
            yield line

data = open_file()
for _ in range(5):
    for line in data:
        pass

Warning

As mentioned in [1], on Windows and macOS, spawn() is the default multiprocessing start method. With spawn(), another interpreter is launched which runs your main script, followed by the internal worker function that receives parameters through pickle serialization. However, decorators, functools, lambdas and local functions do not work well with pickle, as discussed in [2]. Therefore, since version 1.3.36, multiprocess, which replaces pickle with dill, is used instead of multiprocessing. Nevertheless, users should be aware that level='p' may not work on Windows and macOS if the decorated function does not follow the spawn() behaviour.

Notes

Although fork in multiprocessing is quite easy to use, and iterwrap works well with it, users should still be aware that fork is not safe enough, as mentioned in [3].

The default start method is used when dealing with multiprocessing, i.e., spawn on Windows and macOS, and fork on Linux. The default behaviour can be changed with, for example, multiprocessing.set_start_method('spawn'), as described in [3].

References

[1] https://pytorch.org/docs/stable/data.html#platform-specific-behaviors
[2] https://stackoverflow.com/questions/51867402/cant-pickle-function-stringtongrams-at-0x104144f28-its-not-the-same-object
[3] https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
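The pickling limitation mentioned in the warning can be observed with the standard pickle module alone. In this small check, `can_pickle` and `make_local` are helper names introduced only for the illustration; the check says nothing about how dill or multiprocess behave.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_local():
    def local(x):   # defined inside another function, so it has no importable name
        return x
    return local


builtin_ok = can_pickle(len)               # pickled by reference to builtins.len
lambda_ok = can_pickle(lambda x: x + 1)    # lambdas cannot be looked up by name
local_ok = can_pickle(make_local())        # neither can local functions
```

Functions are pickled by reference to their qualified name, so anything that cannot be found again by importing that name fails under the spawn start method.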

loading

longling.lib.loading.csv2jsonl(src: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)), tar: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)) = None, delimiter=', ', **kwargs)

Convert a csv file or io stream into a json file or io stream.

Parameters:

src (PATH_IO_TYPE) -- the data source, either the path to the source file or an io stream.
tar (PATH_IO_TYPE) -- the output target, either the path to the target file or an io stream.
delimiter (str) -- the delimiter used in csv. Commonly used delimiters are "," and " ".
kwargs (dict) -- options passed to csv.DictWriter

Examples

Assume some content has been written to demo.csv.

Use the following code to convert it:

csv2jsonl("demo.csv", "demo.jsonl")

and get the converted content in demo.jsonl.
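As a rough illustration of this kind of conversion, the following standard-library sketch turns each csv row into one json line. The helper name `csv_to_jsonl` and the sample rows are made up for the example, and this is not longling's implementation.

```python
import csv
import io
import json


def csv_to_jsonl(src, tar, delimiter=","):
    """Write one JSON object per CSV row; the header row supplies the keys."""
    for row in csv.DictReader(src, delimiter=delimiter):
        print(json.dumps(row, ensure_ascii=False), file=tar)


# In-memory streams stand in for demo.csv and demo.jsonl.
src = io.StringIO("column1,column2\nhello,world\nhello,everyone\n")
tar = io.StringIO()
csv_to_jsonl(src, tar)
lines = tar.getvalue().splitlines()
```

Each element of `lines` is a self-contained json document, which is what makes the jsonl format convenient for line-by-line loading.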

longling.lib.loading.jsonl2csv(src: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)), tar: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)) = None, delimiter=', ', **kwargs)

Convert a json file or io stream into a csv file or io stream.

Parameters:

src (PATH_IO_TYPE) -- the data source, either the path to the source file or an io stream.
tar (PATH_IO_TYPE) -- the output target, either the path to the target file or an io stream.
delimiter (str) -- the delimiter used in csv. Commonly used delimiters are "," and " ".
kwargs (dict) -- options passed to csv.DictWriter

Examples

Assume some content has been written to demo.jsonl.

Use the following code to convert it:

jsonl2csv("demo.jsonl", "demo.csv")

and get the converted content in demo.csv.

longling.lib.loading.loading(src: (((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)), Ellipsis), src_type=None)

Read a file line by line with buffering.

Supports reading from:

jsonl (applies load_jsonl)
csv (applies load_csv)
files in other formats, which are treated as raw text (applies load_file)
functions, which are invoked and whose result is returned
other objects, which are returned directly

longling.lib.loading.load_jsonl(src: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)))

Read a jsonl file line by line with buffering.

Examples

Assume some content has been written to demo.jsonl. It can be read with:

for line in load_jsonl('demo.jsonl'):
    print(line)

longling.lib.loading.load_csv(src: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)), delimiter=', ', **kwargs)

Read dicts from a csv file.

Examples

Assume some content has been written to demo.csv. It can be read with:

for line in load_csv('demo.csv'):
    print(line)

longling.lib.loading.load_file(src: ((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)))

Read raw text from the source.

Examples

Assume some content has been written to demo.txt.

Use the following code to read it:

for line in load_file('demo.txt'):
    print(line, end="")

and get the original content printed as is.

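parser

A custom configuration file format with a matching parsing toolkit, intended to make configuring and parsing file parameters easier and faster.

longling.lib.parser.get_class_var(class_obj, exclude_names: (<class 'set'>, None) = None, get_vars=None) → dict

Updated in v1.3.18

Get the names and values of all attributes of a class.

Examples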

>>> class A(object):
...     att1 = 1
...     att2 = 2
>>> get_class_var(A)
{'att1': 1, 'att2': 2}
>>> get_class_var(A, exclude_names={"att1"})
{'att2': 2}
>>> class B(object):
...     att3 = 3
...     att4 = 4
...     @staticmethod
...     def excluded_names():
...         return {"att4"}
>>> get_class_var(B)
{'att3': 3}

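Parameters:

class_obj -- a class or a class instance. Note the difference between the two.
exclude_names -- names of the variables to exclude. The names to exclude can also be specified by defining an excluded_names method on the class.
get_vars --

Returns:	the names and values of the class attributes
Return type:	class_var

longling.lib.parser.get_parsable_var(class_obj, parse_exclude: set = None, dump_parse_functions=None, get_vars=True): get all parameters that can be parsed, together with their values; dump_parse_functions can be used to convert values that cannot be dumped

longling.lib.parser.load_configuration(fp, file_format='json', load_parse_function=None)

Load a configuration file.

Updated in version 1.3.16

Parameters:	fp -- file_format -- load_parse_function --

longling.lib.parser.var2exp(var_str, env_wrap=<function <lambda>>)

Convert variables marked with $ into an expression.

Parameters:	var_str -- env_wrap --

Examples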

>>> root = "dir"
>>> dataset = "d1"
>>> eval(var2exp("$root/data/$dataset"))
'dir/data/d1'

longling.lib.parser.path_append(path, *addition, to_str=False)

Join path segments.

Examples

>>> path_append("../", "../data", "../dataset1/", "train", to_str=True)
'../../data/../dataset1/train'

Parameters:

path (str or PurePath) --
addition (list(str or PurePath)) --
to_str (bool) -- Convert the new path to str

class longling.lib.parser.Configuration(logger=<module 'logging'>, **kwargs)

Base class for custom configuration files.

Examples

>>> c = Configuration(a=1, b="example", c=[0,2], d={"a1": 3})
>>> c.instance_var
{'a': 1, 'b': 'example', 'c': [0, 2], 'd': {'a1': 3}}
>>> c.default_file_format()
'json'
>>> c.get("a")
1
>>> c.get("e") is None
True
>>> c.get("e", 0)
0
>>> c.update(e=2)
>>> c["e"]
2

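class_var

Get all parameters that have been set.

Returns:	parameters -- all variables used as parameters
Return type:	dict

dump(cfg_path: str, override=True, file_format=None)

Write the configuration parameters to a file.

Updated in version 1.3.16

Parameters:	cfg_path (str) -- override (bool) -- file_format (str) --

classmethod excluded_names()

Get the set of non-parameter variables.

Returns:	exclude names set -- all non-parameter variables
Return type:	set

classmethod load(cfg_path, file_format=None, **kwargs)

Load the configuration class from a configuration file.

Updated in version 1.3.16

classmethod load_cfg(cfg_path, file_format=None, **kwargs): load the configuration parameters from a configuration file

parsable_var

Get the parameters that can be set from the command line.

Returns:	store_vars -- the parameters that can be set from the command line
Return type:	dict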

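class longling.lib.parser.ConfigurationParser(class_type, excluded_names: (<class 'set'>, None) = None, commands=None, *args, params_help=None, commands_help=None, override_help=False, **kwargs)

Updated in v1.3.18

Configuration parser class, which can be used to build CLI tools. The class first reads all class attributes of the target configuration class class_obj and, after parsing them, generates the command line. Ordinary attribute parameters are read with "--att_name att_value". An extra flag, '--kwargs', is provided to read optional parameters, in the following format:

--kwargs key1=value1;key2=value2;...

First, create a parser:

cli_parser = ConfigurationParser(Configuration)

Besides parsing an existing configuration class, the parser can also take functions, from which subparsers are generated:

cli_parser = ConfigurationParser($function)

or

cli_parser = ConfigurationParser([$function1, $function2])

Parse the arguments in any one of the following three ways:

Command line mode

cli_parser()

String mode

cli_parser('$parameter1 $parameters ...')

List mode

cli_parser(["--a", "int(1)", "--b", "int(2)"])

Notes

Strings containing the following keywords are converted to the corresponding type during parsing:

int, float, dict, list, set, tuple, None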
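The type conversion described in these notes can be pictured with a small sketch. It only mimics the documented input format ("int(1)", "None", "key1=value1;key2=value2"); the helpers `parse_value` and `parse_kwargs` are hypothetical, and the use of `ast.literal_eval` is a choice made for this example, so the library's own parsing rules may differ.

```python
import ast
import re

# Keywords that mark a value as typed rather than a plain string.
_TYPED = re.compile(r"^(int|float|dict|list|set|tuple)\((.*)\)$", re.S)
_CASTERS = {"int": int, "float": float, "dict": dict,
            "list": list, "set": set, "tuple": tuple}


def parse_value(text):
    """Convert 'int(1)'-style strings; leave everything else as str."""
    if text == "None":
        return None
    match = _TYPED.match(text)
    if match is None:
        return text
    type_name, body = match.groups()
    caster = _CASTERS[type_name]
    # literal_eval accepts only Python literals, so no arbitrary code is run.
    return caster(ast.literal_eval(body)) if body else caster()


def parse_kwargs(text):
    """Split 'key1=value1;key2=value2' into a dict of parsed values."""
    pairs = (item.split("=", 1) for item in text.split(";") if item)
    return {key: parse_value(value) for key, value in pairs}


options = parse_kwargs("c=int(3);d=None")
```

Under these assumptions a bare "1" stays the string '1' while "int(1)" becomes the integer 1, which mirrors the difference between 'a' and 'b' in the doctest outputs of this class.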

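Parameters:

class_type -- a class. Note that it must be a class, not a class instance.
excluded_names -- the set of variable names in the class that are not parsed
commands -- the command functions to be parsed

Examples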

>>> class TestC(Configuration):
...     a = 1
...     b = 2
>>> def test_f1(k=1):
...     return k
>>> def test_f2(h=1):
...      return h
>>> def test_f3(m):
...      return m
>>> parser = ConfigurationParser(TestC)
>>> parser("--a 1 --b 2")
{'a': '1', 'b': '2'}
>>> ConfigurationParser.get_cli_cfg(TestC)
{'a': 1, 'b': 2}
>>> parser(["--a", "1", "--b", "int(1)"])
{'a': '1', 'b': 1}
>>> parser(["--a", "1", "--b", "int(1)", "--kwargs", "c=int(3);d=None"])
{'a': '1', 'b': 1, 'c': 3, 'd': None}
>>> parser.add_command(test_f1, test_f2, test_f3)
>>> parser(["test_f1"])
{'a': 1, 'b': 2, 'k': 1, 'subcommand': 'test_f1'}
>>> parser(["test_f2"])
{'a': 1, 'b': 2, 'h': 1, 'subcommand': 'test_f2'}
>>> parser(["test_f3", "3"])
{'a': 1, 'b': 2, 'm': '3', 'subcommand': 'test_f3'}
>>> parser = ConfigurationParser(TestC, commands=[test_f1, test_f2])
>>> parser(["test_f1"])
{'a': 1, 'b': 2, 'k': 1, 'subcommand': 'test_f1'}
>>> class TestCC:
...     c = {"_c": 1, "_d": 0.1}
>>> parser = ConfigurationParser(TestCC)
>>> parser("--c _c=int(3);_d=float(0.3)")
{'c': {'_c': 3, '_d': 0.3}}
>>> class TestCls:
...     def a(self, a=1):
...         return a
...     @staticmethod
...     def b(b=2):
...         return b
...     @classmethod
...     def c(cls, c=3):
...         return c
>>> parser = ConfigurationParser(TestCls, commands=[TestCls.b, TestCls.c])
>>> parser("b")
{'b': 2, 'subcommand': 'b'}
>>> parser("c")
{'c': 3, 'subcommand': 'c'}

add_command(*commands, help_info: (typing.List[typing.Dict], <class 'dict'>, <class 'str'>, None) = None): add subcommand parsers in batch

static func_spec(f): get the parameter table of a function

classmethod get_cli_cfg(params_class) → dict: get the default configuration parameters

static parse(arguments): post-parse the arguments

class longling.lib.parser.Formatter(formatter: (<class 'str'>, None) = None)

Format a string with a given format.

Examples

>>> formatter = Formatter()
>>> formatter("hello world")
'hello world'
>>> formatter = Formatter("hello {}")
>>> formatter("world")
'hello world'
>>> formatter = Formatter("hello {} v{:.2f}")
>>> formatter("world", 0.2)
'hello world v0.20'
>>> formatter = Formatter("hello {1} v{0:.2f}")
>>> formatter(0.2, "world")
'hello world v0.20'
>>> Formatter.format(0.2, "world", formatter="hello {1} v{0:.3f}")
'hello world v0.200'

class longling.lib.parser.ParserGroup(parsers: dict, prog=None, usage=None, description=None, epilog=None, add_help=True)

>>> class TestC(Configuration):
...     a = 1
...     b = 2
>>> def test_f1(k=1):
...     return k
>>> def test_f2(h=1):
...      return h
>>> class TestC2(Configuration):
...     c = 3
>>> parser1 = ConfigurationParser(TestC, commands=[test_f1])
>>> parser2 = ConfigurationParser(TestC, commands=[test_f2])
>>> pg = ParserGroup({"model1": parser1, "model2": parser2})
>>> pg(["model1", "test_f1"])
{'a': 1, 'b': 2, 'k': 1, 'subcommand': 'test_f1'}
>>> pg("model2 test_f2")
{'a': 1, 'b': 2, 'h': 1, 'subcommand': 'test_f2'}

longling.lib.parser.is_classmethod(method)

Parameters:	method --

Examples

>>> class A:
...     def a(self):
...         pass
...     @staticmethod
...     def b():
...         pass
...     @classmethod
...     def c(cls):
...         pass
>>> obj = A()
>>> is_classmethod(obj.a)
False
>>> is_classmethod(obj.b)
False
>>> is_classmethod(obj.c)
True
>>> def fun():
...     pass
>>> is_classmethod(fun)
False

path

longling.lib.path.path_append(path, *addition, to_str=False)

Join path segments.

Examples

>>> path_append("../", "../data", "../dataset1/", "train", to_str=True)
'../../data/../dataset1/train'

Parameters:

path (str or PurePath) --
addition (list(str or PurePath)) --
to_str (bool) -- Convert the new path to str
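The result in the example keeps its ".." components, which is how pure path joining behaves in the standard library. The sketch below reproduces that output with `pathlib.PurePosixPath`; `join_paths` is a hypothetical name, and whether path_append is built on pathlib in this way is an assumption.

```python
from pathlib import PurePosixPath


def join_paths(path, *addition, to_str=False):
    """Join path segments without resolving '..' components."""
    joined = PurePosixPath(path, *addition)
    return str(joined) if to_str else joined


joined = join_paths("../", "../data", "../dataset1/", "train", to_str=True)
```

Pure paths never touch the file system, so "data/.." is not collapsed; resolving it would need something like `os.path.normpath`.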

longling.lib.path.file_exist(filepath): check whether the file exists

longling.lib.path.abs_current_dir(filepath)

Get the absolute path of the directory containing the file.

longling.lib.path.type_from_name(filename)

Examples

>>> type_from_name("1.txt")
'.txt'

longling.lib.path.tmpfile(suffix=None, prefix=None, dir=None)

Create a temporary file, which is cleaned up automatically after use (outside the "with" block).

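progress

A progress monitor that helps users see the current running progress, mainly designed for the epoch and batch setting in machine learning.

Unlike tqdm, which quickly adapts to a single iterable object, progress aims to modularize the functional parts of a monitor and then assemble them. This makes a dynamic description possible and gives users more flexibility.

MonitorPlayer defines how to display the progress and other process parameters (better than tqdm, where only n is changed and description is fixed)
- Define how to display in the __call__ method
Inherit from ProgressMonitor and instantiate it with the necessary arguments
- Override the __call__ method of ProgressMonitor and wrap the iterator with IterableMIcing; this step allows the operations before and after iteration to be defined flexibly
- A MonitorPlayer instance must be passed in at __init__
IterableMIcing assembles the iterator and the monitor

A simple example: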

class DemoMonitor(ProgressMonitor):
    def __call__(self, iterator):
        return IterableMIcing(
            iterator,
            self.player, self.player.set_length
        )

progress_monitor = DemoMonitor(MonitorPlayer())

for _ in range(5):
    for _ in progress_monitor(range(10000)):
        pass
    print()

Cooperating with tqdm:

from tqdm import tqdm

class DemoTqdmMonitor(ProgressMonitor):
    def __call__(self, iterator, **kwargs):
        return tqdm(iterator, **kwargs)

class longling.lib.progress.IterableMIcing(iterator: (typing.Iterable, <class 'list'>, <class 'tuple'>, <class 'dict'>), hook_in_iter=<function pass_function>, hook_after_iter=<function pass_function>, length: (<class 'int'>, None) = None)

Wraps an iterator into an iterable class that the monitor can use:
* adds a counter, count, which is incremented by 1 on each iteration; when the iteration ends, the total length of the data can be obtained from count
* calls hook_in_iter on each iteration
* calls hook_after_iter when the iteration ends

Parameters:

iterator -- the data to iterate over
hook_in_iter -- callback invoked during each iteration (for example, to print the progress); takes the current count as input
hook_after_iter -- callback invoked after each round of iteration (once all data has been traversed); takes the current length as input
length -- the total length of the data (the number of items)

Examples

>>> iterator = IterableMIcing(range(100))
>>> for i in iterator:
...     pass
>>> len(iterator)
100
>>> def iter_fn(num):
...     for i in range(num):
...         yield num
>>> iterator = IterableMIcing(iter_fn(50))
>>> for i in iterator:
...     pass
>>> len(iterator)
50
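The counting and hook mechanism described above can be sketched in a few lines. `CountingIterable` is a hypothetical class written for this illustration; it follows the documented parameter names but is not the implementation of IterableMIcing.

```python
class CountingIterable:
    """Hypothetical sketch: counts items and fires hooks during iteration."""

    def __init__(self, iterator, hook_in_iter=None, hook_after_iter=None, length=None):
        self.iterator = iterator
        self.hook_in_iter = hook_in_iter or (lambda count: None)
        self.hook_after_iter = hook_after_iter or (lambda length: None)
        self.length = length
        self.count = 0

    def __iter__(self):
        self.count = 0
        for item in self.iterator:
            self.count += 1
            self.hook_in_iter(self.count)    # e.g. redraw a progress line
            yield item
        self.length = self.count             # the total is known after a full pass
        self.hook_after_iter(self.length)

    def __len__(self):
        if self.length is None:
            raise TypeError("length is unknown before the first full pass")
        return self.length


def iter_fn(num):
    for i in range(num):
        yield i


seen = []
wrapped = CountingIterable(iter_fn(50), hook_in_iter=seen.append)
for _ in wrapped:
    pass
```

Passing `seen.append` as the in-iteration hook shows that the hook receives the running count, which is the value a display component would need to draw the progress.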

class longling.lib.progress.MonitorPlayer: asynchronous monitor display

class longling.lib.progress.AsyncMonitorPlayer(cache_size=10000): asynchronous monitor display

regex

longling.lib.regex.variable_replace(string: str, key_lower: bool = True, quotation: str = '', **variables)

Examples

>>> string = "hello $who"
>>> variable_replace(string, who="world")
'hello world'
>>> string = "hello $WHO"
>>> variable_replace(string, key_lower=False, WHO="world")
'hello world'
>>> string = "hello $WHO"
>>> variable_replace(string, who="longling")
'hello longling'
>>> string = "hello $Wh_o"
>>> variable_replace(string, wh_o="longling")
'hello longling'

longling.lib.regex.default_variable_replace(string: str, default_value: (<class 'str'>, None, <class 'dict'>) = None, key_lower: bool = True, quotation: str = '', **variables) → str

Examples

>>> string = "hello $who, I am $author"
>>> default_variable_replace(string, default_value={"author": "groot"}, who="world")
'hello world, I am groot'
>>> string = "hello $who, I am $author"
>>> default_variable_replace(string, default_value={"author": "groot"})
'hello , I am groot'
>>> string = "hello $who, I am $author"
>>> default_variable_replace(string, default_value='', who="world")
'hello world, I am '
>>> string = "hello $who, I am $author"
>>> default_variable_replace(string, default_value=None, who="world")
'hello world, I am $author'

stream

This module is used for stream processing.

longling.lib.stream.to_io(stream: (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, (<class 'str'>, <class 'pathlib.PurePath'>), <class 'list'>, None) = None, mode='r', encoding='utf-8', **kwargs)

Convert an object to an io stream; the object can be a file path or an io stream.

Examples

to_io("demo.txt")  # equal to open("demo.txt")
to_io(open("demo.txt"))  # equal to open("demo.txt")
a = to_io()  # equal to a = sys.stdin
b = to_io(mode="w")  # equal to b = sys.stdout

longling.lib.stream.as_io(src: (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, (<class 'str'>, <class 'pathlib.PurePath'>), <class 'list'>, None) = None, mode='r', encoding='utf-8', **kwargs)

`with` wrapper for the to_io function; the default mode is "r".

Examples

with as_io("demo.txt") as f:
    for line in f:
        pass

# equal to
with open("demo.txt") as src:
    with as_io(src) as f:
        for line in f:
            pass

# from several files
with as_io(["demo1.txt", "demo2.txt"]) as f:
    for line in f:
        pass

# from sys.stdin
with as_io() as f:
    for line in f:
        pass

longling.lib.stream.as_out_io(tar: (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, (<class 'str'>, <class 'pathlib.PurePath'>), <class 'list'>, None) = None, mode='w', encoding='utf-8', **kwargs)

`with` wrapper for the to_io function; the default mode is "w".

Examples

with as_out_io("demo.txt") as wf:
    print("hello world", file=wf)

# equal to
with open("demo.txt", "w") as tar:
    with as_out_io(tar) as wf:
        print("hello world", file=wf)

# to sys.stdout
with as_out_io() as wf:
    print("hello world", file=wf)

# to sys.stderr
with as_out_io(mode="stderr") as wf:
    print("hello world", file=wf)

longling.lib.stream.wf_open(stream_name: (((<class 'str'>, <class 'pathlib.PurePath'>), (<class '_io.TextIOWrapper'>, <class 'typing.TextIO'>, <class 'typing.BinaryIO'>, <class 'codecs.StreamReaderWriter'>, <class 'fileinput.FileInput'>)), None) = None, mode='w', encoding='utf-8', **kwargs)

Simple wrapper to codecs for writing.

When stream_name is empty, the standard error stream stderr is returned if mode is "stderr"; otherwise, the standard output stream stdout is returned.

When stream_name is not empty, a file stream is returned.

Parameters:

stream_name (str, PurePath or None) --
mode (str) --
encoding (str) -- the encoding, utf-8 by default

Returns:	write_stream -- the opened stream
Return type:	StreamReaderWriter

Examples

>>> wf = wf_open(mode="stdout")
>>> print("hello world", file=wf)
hello world

longling.lib.stream.close_io(stream): close the file stream, ignoring sys.stdin, sys.stdout and sys.stderr

longling.lib.stream.flush_print(*values, **kwargs): print with flushing

longling.lib.stream.build_dir(path, mode=509, parse_dir=True)

Create a directory: the directory path is parsed from path, and the directory is created if it does not exist.

Parameters:	path (str) -- mode (int) -- parse_dir (bool) --
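The default mode=509 in the signature is the decimal form of the octal permission 0o775. A minimal sketch of the described behaviour with `os.makedirs` follows; `ensure_dir` is a hypothetical name, and treating parse_dir=True as "path names a file whose parent directory is created" is an assumption based on the description above.

```python
import os
import tempfile


def ensure_dir(path, mode=0o775, parse_dir=True):
    """Create the directory for `path` if it does not exist yet."""
    # With parse_dir=True, `path` names a file and its parent directory is created.
    dirname = os.path.dirname(path) if parse_dir else path
    if dirname:
        os.makedirs(dirname, mode=mode, exist_ok=True)
    return dirname


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b", "data.txt")
    created = ensure_dir(target)
    created_exists = os.path.isdir(created)
```

Here `exist_ok=True` makes the call safe to repeat, which matches the "create only if missing" wording.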

class longling.lib.stream.AddPrinter(fp, values_wrapper=<function AddPrinter.<lambda>>, to_io_params=None, ensure_io=False, **kwargs)

A printer that appends content to a file through its add method.

Examples

>>> import sys
>>> printer = AddPrinter(sys.stdout, ensure_io=True)
>>> printer.add("hello world")
hello world

exception longling.lib.stream.StreamError

longling.lib.stream.check_file(filepath, size=None)

Check whether the file exists; when size is given, also check whether the file size matches.

Parameters:	filepath (str) -- size (int) --
Returns:	whether the file exists
Return type:	bool

longling.lib.stream.encode(src, src_encoding, tar, tar_encoding)

Convert a file from the source encoding to the target encoding.

Parameters:	src -- src_encoding -- tar -- tar_encoding --

longling.lib.stream.block_std()

Examples

>>> print("hello world")
hello world
>>> with block_std():
...     print("hello world")
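In the doctest above, the second print produces no output because it runs inside the block. A comparable effect can be sketched with `contextlib.redirect_stdout`; `silence_stdout` is a hypothetical helper that only covers stdout, and it is not how block_std is necessarily implemented.

```python
import contextlib
import io
import os


@contextlib.contextmanager
def silence_stdout():
    """Discard everything written to sys.stdout inside the block."""
    with open(os.devnull, "w") as sink, contextlib.redirect_stdout(sink):
        yield


# An outer redirect captures what would normally reach the console.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    print("visible")
    with silence_stdout():
        print("hidden")
    print("visible again")
```

Only the two outer prints end up in `captured`; output resumes as soon as the inner block exits.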

structure

class longling.lib.structure.AttrDict(*args, **kwargs)

Example

>>> ad = AttrDict({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
>>> ad
{'first_name': 'Eduardo', 'last_name': 'Pool', 'age': 24, 'sports': ['Soccer']}
>>> ad.first_name
'Eduardo'
>>> ad.age
24
>>> ad.age = 16
>>> ad.age
16
>>> ad["age"] = 20
>>> ad["age"]
20

class longling.lib.structure.SortedList(iterable: Iterable[T_co] = (), key=None)

A list that keeps its elements in ascending order.

A custom key function can be supplied to customize the sort order.

Examples

>>> sl = SortedList()
>>> sl.adds(*[1, 2, 3, 4, 5])
>>> sl
[1, 2, 3, 4, 5]
>>> sl.add(7)
>>> sl
[1, 2, 3, 4, 5, 7]
>>> sl.add(6)
>>> sl
[1, 2, 3, 4, 5, 6, 7]
>>> sl = SortedList([4])
>>> sl.add(3)
>>> sl.add(2)
>>> sl
[2, 3, 4]
>>> list(reversed(sl))
[4, 3, 2]
>>> sl = SortedList([("harry", 1), ("tom", 0)], key=lambda x: x[1])
>>> sl
[('tom', 0), ('harry', 1)]
>>> sl.add(("jack", -1), key=lambda x: x[1])
>>> sl
[('jack', -1), ('tom', 0), ('harry', 1)]
>>> sl.add(("ada", 2))
>>> sl
[('jack', -1), ('tom', 0), ('harry', 1), ('ada', 2)]
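Keeping a list ordered on insertion is typically done with binary search. The sketch below uses the standard `bisect` module with a parallel list of keys; `sorted_insert` is a hypothetical function and does not describe SortedList's internals.

```python
import bisect


def sorted_insert(items, keys, value, key=None):
    """Insert value into items, keeping items ordered by key(value)."""
    k = value if key is None else key(value)
    # bisect_right places equal keys after existing ones (stable insertion).
    index = bisect.bisect_right(keys, k)
    keys.insert(index, k)
    items.insert(index, value)


items, keys = [], []
for person in [("harry", 1), ("tom", 0), ("jack", -1), ("ada", 2)]:
    sorted_insert(items, keys, person, key=lambda p: p[1])
```

The separate `keys` list is kept so that the search works on plain comparable values, which also runs on Python versions where `bisect` has no `key` argument.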

longling.lib.structure.nested_update(src: dict, update: dict)

Examples

>>> nested_update({"a": {"x": 1}}, {"a": {"y": 2}})
{'a': {'x': 1, 'y': 2}}
>>> nested_update({"a": {"x": 1}}, {"a": {"x": 2}})
{'a': {'x': 2}}
>>> nested_update({"a": {"x": 1}}, {"b": {"y": 2}})
{'a': {'x': 1}, 'b': {'y': 2}}
>>> nested_update({"a": {"x": 1}}, {"a": 2})
{'a': 2}
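The four cases above are consistent with a simple recursive merge. The following sketch reproduces them; `merge_nested` is a hypothetical function, and updating `src` in place is an assumption made here rather than documented behaviour.

```python
def merge_nested(src, update):
    """Recursively merge `update` into `src` in place and return `src`."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(src.get(key), dict):
            merge_nested(src[key], value)   # both sides are dicts: descend
        else:
            src[key] = value                # otherwise the new value wins
    return src


merged = merge_nested({"a": {"x": 1}}, {"a": {"y": 2}})
```

Unlike `dict.update`, which would replace the whole value under "a", the recursive version keeps sibling keys of nested dictionaries.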

testing

longling.lib.testing.simulate_stdin(*inputs)

Simulate standard input in tests.

Parameters:	inputs (list of str) --

Examples

>>> with simulate_stdin("12", "", "34"):
...     a = input()
...     b = input()
...     c = input()
>>> a
'12'
>>> b
''
>>> c
'34'
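One common way to obtain this behaviour is to swap `sys.stdin` for an in-memory stream for the duration of the block. The sketch below shows that technique; `fake_stdin` is a hypothetical name and this is not necessarily how simulate_stdin is written.

```python
import io
import sys
from contextlib import contextmanager


@contextmanager
def fake_stdin(*inputs):
    """Temporarily replace sys.stdin with the given lines."""
    original = sys.stdin
    sys.stdin = io.StringIO("\n".join(inputs) + "\n")
    try:
        yield
    finally:
        sys.stdin = original   # always restore, even if the block raises


with fake_stdin("12", "", "34"):
    a = input()
    b = input()
    c = input()
```

Each `input()` call consumes one line of the in-memory stream, so an empty string in the arguments simulates the user pressing Enter.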

time

longling.lib.time.get_current_timestamp() → str

Examples

>>> get_current_timestamp()
'20200327172235'
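The sample output has 14 digits, which matches the layout year, month, day, hour, minute, second. A sketch producing the same layout with `datetime.strftime` is shown below; `current_timestamp` and its optional `now` argument are introduced here for illustration and for testability.

```python
from datetime import datetime


def current_timestamp(now=None):
    """Format a datetime as YYYYMMDDHHMMSS (14 digits)."""
    now = now or datetime.now()
    return now.strftime("%Y%m%d%H%M%S")


stamp = current_timestamp(datetime(2020, 3, 27, 17, 22, 35))
```

Timestamps in this layout sort lexicographically in chronological order, which is convenient for file and directory names.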

utilog

Logging configuration

longling.lib.utilog.config_logging(filename=None, log_format=None, level=20, logger=None, console_log_level=None, propagate=False, mode='a', file_format=None, encoding: (<class 'str'>, None) = 'utf-8', enable_colored=False, datefmt=None)

Main logging configuration function.

Parameters:

filename (str or None) -- name of the file in which logs are stored; when not empty, the file is created to store the logs
log_format (str) -- the default log output format is %(name)s, %(levelname)s %(message)s. If datefmt is specified, it becomes %(name)s: %(asctime)s, %(levelname)s %(message)s
level (str or int) -- default log level
logger (str or logging.Logger) -- the logger; can be empty (the root logger is used), a string (a logger with that name is created) or a logger
console_log_level (str, int or None) -- console log level; when not empty, console log output is enabled
propagate (bool) --
mode (str) --
file_format (str or None) -- file log output format; when empty, log_format is used
encoding --
enable_colored (bool) --
datefmt (str) --

longling.lib.utilog.default_timestamp() → str

Examples

>>> get_current_timestamp()
'20200327172235'

yaml

class longling.lib.yaml_helper.FoldedString

longling.lib.yaml_helper.dump_folded_yaml(yaml_string): specially designed for the arch module; should not be used elsewhere

longling.lib.yaml_helper.ordered_yaml_load(stream, Loader=<class 'yaml.loader.Loader'>, object_pairs_hook=<class 'collections.OrderedDict'>)

Examples

ordered_yaml_load("path_to_file.yaml")
OrderedDict({"a":123})