
From ChatGPT:

In general, when a program enters a signal handler in a multi-threaded environment, the behavior regarding other threads depends on how the signal handler is set up and the specific signal that is being handled.

  1. Default Behavior: A signal sent to the process as a whole is delivered to one thread that does not have it blocked (which thread is unspecified), and the signal handler executes in the context of that thread. Other threads in the process continue running unless they are also interrupted by signals.
  2. Thread-Specific Signal Handling: Some signals, such as SIGINT (interrupt signal) or SIGTERM (termination signal), are typically sent to the entire process, which means they can interrupt any thread that has not blocked them. However, synchronous signals, like SIGSEGV (segmentation fault) or SIGILL (illegal instruction), are delivered to the specific thread that caused the signal. SIGABRT (abort signal) raised by abort() is likewise delivered to the calling thread.
  3. Signal Masking: Each thread has its own signal mask. In a multi-threaded program, use pthread_sigmask to block certain signals in specific threads (POSIX leaves the behavior of sigprocmask unspecified in multi-threaded processes). This affects whether a signal handler can interrupt a particular thread.
  4. Asynchronous-Signal-Safe Functions: Signal handlers should only call functions that are considered "async-signal-safe" according to the POSIX standard. These functions are safe to call from within a signal handler. Calling any other function in a signal handler can lead to undefined behavior.
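
The masking behavior in point 3 can be sketched with Python's signal module, which wraps pthread_sigmask and pthread_kill. This is only an illustration under assumptions that are not in the notes above: a POSIX platform, code running in the main thread (Python only installs handlers there), and SIGUSR1 picked arbitrarily as the test signal. The names on_usr1 and demo are hypothetical.

```python
import signal
import threading

received = []


def on_usr1(signum, frame):
    # Python always runs signal handlers in the main thread.
    received.append(signum)


def demo():
    received.clear()
    signal.signal(signal.SIGUSR1, on_usr1)
    # Block SIGUSR1 in the calling thread only; other threads keep their own masks.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    # Send the signal to this specific thread (thread-directed delivery).
    signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
    pending = signal.SIGUSR1 in signal.sigpending()
    ran_while_blocked = bool(received)
    # Unblocking delivers the pending signal, so the handler runs now.
    signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
    ran_after_unblock = bool(received)
    # Put the default disposition back so the demo leaves no handler behind.
    signal.signal(signal.SIGUSR1, signal.SIG_DFL)
    return pending, ran_while_blocked, ran_after_unblock


if __name__ == "__main__":
    print(demo())
```

While the signal is blocked it stays pending and the handler does not run; it runs as soon as the mask is lifted.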

Control the number of threads

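  • Method 1: Use the environment variable TBB_NUM_THREADS for the global setting.
export TBB_NUM_THREADS=4

TODO: This does not seem to work! As far as I can tell, TBB does not document a TBB_NUM_THREADS variable; the documented process-wide limit is tbb::global_control with max_allowed_parallelism.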

  • Method 2: Use tbb::task_arena or tbb::task_scheduler_init (Deprecated).

TBB applies this setting locally: it only affects work that is executed inside the tbb::task_arena (through arena.execute()).

#include <tbb/pipeline.h> // oneTBB: <tbb/parallel_pipeline.h>
// Deprecated (removed in oneTBB):
// #include <tbb/task_scheduler_init.h>
#include <tbb/task_arena.h>

// Define your pipeline body
class MyPipeline {
public:
    void operator()(tbb::flow_control& fc) const {
        // Your pipeline logic here
        // ...
        // Inform the pipeline that there is no more data
        fc.stop();
    }
};

int main() {
    // Deprecated: tbb::task_scheduler_init init(1);
    tbb::task_arena arena(4); // at most 4 threads
    // The limit only applies to work submitted through arena.execute().
    arena.execute([] {
        tbb::parallel_pipeline(
            /* max_number_of_live_tokens */ 4,
            // oneTBB spells the mode tbb::filter_mode::serial_in_order
            tbb::make_filter<void, void>(tbb::filter::serial_in_order, MyPipeline()));
    });

    return 0;
}

gcc

gcc is a compiler suite that includes compilers for C, C++, and Fortran.

glibc

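glibc is a library that provides basic common functionality for C programs, including system calls, math functions, and other core components.
Both the Linux platform and vscode appear to depend on glibc; if you point LD_LIBRARY_PATH at the path of a different glibc version, bash crashes immediately.

glibc contains the following binaries and libraries: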

$ cd glibc-v2.34/Linux/RHEL7.0-2017-x86_64/bin && ls
catchsegv getconf iconv locale makedb pcprofiledump sotruss tzselect zdump
gencat getent ldd localedef mtrace pldd sprof xtrace

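# Running ls inside the lib directory of another glibc version fails, probably because the glibc libraries in the current directory conflict with the system libraries.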
$ cd ../lib && ls
ls: relocation error: ./libc.so.6: symbol __tunable_get_val, version GLIBC_PRIVATE not defined in file ld-linux-x86-64.so.2 with link time reference

$ cd .. && ls lib
Mcrt1.o libanl.so.1 libm.so libnss_hesiod.so.2
Scrt1.o libc.a libm.so.6 libpcprofile.so
audit libc.so libmcheck.a libpthread.a
crt1.o libc.so.6 libmemusage.so libpthread.so.0
crti.o libc_malloc_debug.so libmvec.a libresolv.a
crtn.o libc_malloc_debug.so.0 libmvec.so libresolv.so
gconv libc_nonshared.a libmvec.so.1 libresolv.so.2
gcrt1.o libcrypt.a libnsl.so.1 librt.a
ld-linux-x86-64.so.2 libcrypt.so libnss_compat.so librt.so.1
libBrokenLocale.a libcrypt.so.1 libnss_compat.so.2 libthread_db.so
libBrokenLocale.so libdl.a libnss_db.so libthread_db.so.1
libBrokenLocale.so.1 libdl.so.2 libnss_db.so.2 libutil.a
libSegFault.so libg.a libnss_dns.so.2 libutil.so.1
libanl.a libm-2.34.a libnss_files.so.2
libanl.so libm.a libnss_hesiod.so

Check the glibc version:

# As shown above, ldd is one of the core components of glibc
$ ldd --version

Find the path of libc.so:

$ locate libc.so
/usr/lib/x86_64-linux-gnu/libc.so
/usr/lib/x86_64-linux-gnu/libc.so.6
$ locate libstdc++.so
/usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30
/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30-gdb.py

Install glibc:

Ubuntu

sudo apt-get install libc6

RedHat

sudo yum install glibc

Check the version of the GNU C++ Library (libstdc++):

$ strings /usr/lib/libstdc++.so.* | grep LIBCXX
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
...
GLIBCXX_3.4.19
GLIBCXX_DEBUG_MESSAGE_LENGTH

$ strings /usr/lib/libc.so.* | grep GLIBC
GLIBC_2.0
GLIBC_2.1
GLIBC_2.1.1
...
GLIBC_2.17
GLIBC_PRIVATE

If you have a specific binary or application that uses libstdc++, you can check which libstdc++ it links against with the following command:

$ ldd <your_binary_or_application> | grep libstdc++

When you try to connect to Linux with the "Remote SSH" tool of vscode, it may report the following error:

Warning: Missing GLIBCXX >= 3.4.25! from /usr/lib64/libstdc++.so.6.0.19
Warning: Missing GLIBC >= 2.28! from /usr/lib64/libc-2.17.so
Error: Missing required dependencies. Please refer to our FAQ https://aka.ms/vscode-remote/faq/old-linux for additional information.

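This happens because the libstdc++ and glibc installed on the Linux system are too old: they do not provide GLIBCXX_3.4.25 or GLIBC_2.28 and later. You then need to downgrade vscode (recommended) or upgrade glibc (which appears to be difficult).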

times

  1. bash built-in
times
  2. function
#include <sys/times.h>

clock_t times(struct tms *buf);
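
The same counters can be read from Python, whose os.times() reports user, system, and elapsed time in seconds. A minimal sketch, assuming a POSIX platform; cpu_times_for is a hypothetical helper, not part of any library:

```python
import os


def cpu_times_for(work):
    """Return (user, system, elapsed) seconds consumed while running work()."""
    before = os.times()
    work()
    after = os.times()
    return (
        after.user - before.user,        # CPU time spent in user mode
        after.system - before.system,    # CPU time spent in the kernel
        after.elapsed - before.elapsed,  # wall-clock time
    )


if __name__ == "__main__":
    user, system, elapsed = cpu_times_for(lambda: sum(i * i for i in range(1_000_000)))
    print(f"user={user:.3f}s system={system:.3f}s elapsed={elapsed:.3f}s")
```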

malloc/free

See this example

char ** backtrace_symbols (void *const *buffer, int size) 

The return value of backtrace_symbols is a pointer obtained via the malloc function, and it is the responsibility of the caller to free that pointer. Note that only the return value need be freed, not the individual strings.

Question: Why does it say “only the return value need be freed, not the individual strings”?

Let us look at the definitions of the malloc/free functions first:

void *malloc( size_t size );
void free( void *ptr );

free takes a void* pointer to deallocate the memory, so it does not care what type the pointer has, even if it is a multi-level pointer. This implies that malloc records the size of each block somewhere, and free looks that size up before deallocating the memory.

Let us return to the question. The pointer returned by backtrace_symbols has type char**, so it presumably points to one contiguous block obtained from malloc that holds both the pointer array and the strings, cast to char** on return. When we free that block, the allocator (the C library, not the Linux kernel) finds its actual size and deallocates all of it at once.

Example:

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    // One block: 3 pointers followed by the string storage
    // (assuming a maximum of 50 characters per sentence)
    char** strings = (char**)malloc(3 * sizeof(char*) + 3 * 50);
    if (strings == NULL) {
        return 1;
    }
    char* block = (char*)(strings + 3); // string storage starts right after the pointer array
    char* s1 = strcpy(block, "The first sentence"); block += strlen(s1) + 1;
    char* s2 = strcpy(block, "The second sentence"); block += strlen(s2) + 1;
    char* s3 = strcpy(block, "The third sentence");
    strings[0] = s1;
    strings[1] = s2;
    strings[2] = s3;
    for (int i = 0; i < 3; ++i) {
        printf("%s\n", strings[i]);
    }
    free(strings); // deallocate all memory at once

    return 0;
}

More elegant but less economical code:

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    char** strings = (char**)malloc(3 * sizeof(char*) + 3 * 50);
    if (strings == NULL) {
        return 1;
    }
    char* block = (char*)(strings + 3);
    for (int i = 0; i < 3; ++i) {
        strings[i] = block + i * 50; // Assuming a maximum of 50 characters per sentence
    }
    strcpy(strings[0], "The first sentence");
    strcpy(strings[1], "The second sentence");
    strcpy(strings[2], "The third sentence");

    for (int i = 0; i < 3; ++i) {
        printf("%s\n", strings[i]);
    }

    free(strings); // deallocate all memory at once

    return 0;
}

Reference

shuf

cut

tr

lp

sort

Options:

-t, --field-separator=SEP
    use SEP instead of non-blank to blank transition

-k, --key=POS1[,POS2]
    start a key at POS1 (origin 1), end it at POS2 (default end of line)

-h, --human-numeric-sort
    compare human readable numbers (e.g., 2K 1G)

-n, --numeric-sort
    compare according to string numerical value
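
To see what these options do, the sketch below drives sort from Python through subprocess. It assumes a sort with GNU-style options on PATH (-h in particular is not universal) and forces the C locale so the byte-wise ordering is predictable; run_sort and the sample data are hypothetical.

```python
import os
import subprocess


def run_sort(text, *options):
    """Pipe text through sort with the given options and return the output lines."""
    env = dict(os.environ, LC_ALL="C")  # fixed locale for reproducible ordering
    result = subprocess.run(
        ["sort", *options],
        input=text,
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    records = "b:10\na:9\nc:100\n"
    # -t ':' splits fields on ':'; -k 2,2 restricts the key to field 2 (string order).
    print(run_sort(records, "-t", ":", "-k", "2,2"))
    # Appending n to the key compares field 2 numerically, like -n.
    print(run_sort(records, "-t", ":", "-k", "2,2n"))
    # -h understands size suffixes such as K and G.
    print(run_sort("2K\n1G\n512\n", "-h"))
```

With the string key, "9" sorts after "100"; with the numeric key it sorts first.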

nproc

Print the number of processing units available.

od / xxd / hexdump

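Dump the contents of a binary file.

Notes: byte order. xxd prints the bytes in file order, while the default format of hexdump prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes appears swapped (hexdump -C prints the bytes in file order).
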
$ echo -n "ABCD" | xxd
00000000: 4142 4344 ABCD
$ echo -n "ABCD" | hexdump
0000000 4241 4443
0000004
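
The swapped pairs in the hexdump output can be reproduced with Python's struct module. This is just a sketch of the interpretation, assuming the dump was taken on a little-endian machine; the variable names are illustrative.

```python
import struct

data = b"ABCD"

# Byte-by-byte view, as xxd shows it: 41 42 43 44
byte_view = data.hex()

# The same bytes read as two 16-bit words, little-endian (hexdump's default on x86)
little_endian_words = [f"{word:04x}" for word in struct.unpack("<2H", data)]

# ... and read as big-endian words, which matches the file order again
big_endian_words = [f"{word:04x}" for word in struct.unpack(">2H", data)]

if __name__ == "__main__":
    print(byte_view)            # 41424344
    print(little_endian_words)  # ['4241', '4443']
    print(big_endian_words)     # ['4142', '4344']
```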

Reference

comm / diff / tkdiff / cmp

Can be used to compare binary or non-binary files.

comm

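Compare two sorted files line by line. The output has three tab-separated columns: lines only in file1, lines only in file2, and lines common to both files.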

$ cat file1.txt 
apple
banana
cherry

$ cat file2.txt
banana
cherry
date
erase

$ comm file1.txt file2.txt
apple
		banana
		cherry
	date
	erase
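
The column assignment can be sketched as a merge of two sorted lists. This is an illustration of the idea, not the coreutils implementation; comm_columns is a hypothetical name and both inputs are assumed to be sorted already.

```python
def comm_columns(first, second):
    """Merge two sorted lists and tag each line with its comm column (1, 2 or 3)."""
    i = j = 0
    tagged = []
    while i < len(first) and j < len(second):
        if first[i] == second[j]:
            tagged.append((3, first[i]))  # common to both files
            i += 1
            j += 1
        elif first[i] < second[j]:
            tagged.append((1, first[i]))  # only in the first file
            i += 1
        else:
            tagged.append((2, second[j]))  # only in the second file
            j += 1
    tagged.extend((1, line) for line in first[i:])
    tagged.extend((2, line) for line in second[j:])
    return tagged


if __name__ == "__main__":
    file1 = ["apple", "banana", "cherry"]
    file2 = ["banana", "cherry", "date", "erase"]
    for column, line in comm_columns(file1, file2):
        # Column n is indented by n-1 tabs, as in comm's output.
        print("\t" * (column - 1) + line)
```

Because the merge only ever looks at the current head of each list, an unsorted input makes it pair up the wrong lines, which is why comm insists on sorted files.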

Both files must be sorted before they are passed to comm. Otherwise it complains:

comm: file 1 is not in sorted order

and cannot work correctly. For example,

$ cat file1.txt 
apple
cherry
banana

$ cat file2.txt
banana
cherry
date
erase

$ comm file1.txt file2.txt
apple
	banana
		cherry
comm: file 1 is not in sorted order
banana
	date
	erase
comm: input is not in sorted order

diff

Syntax:

diff -u file1 file2

Options:

-e, --ed
    output an ed script

-u, -U NUM, --unified[=NUM]
    output NUM (default 3) lines of unified context
    (that is, print NUM lines before and after the difference line)

tkdiff

Use a GUI to display the differences.

cmp

Prints less information than diff.

Syntax:

cmp file1 file2

ed/vim/sed/awk

List initialization

struct, union, and array types support list initialization by default.

Struct and union initialization
Array initialization

#include <iostream>
#include <vector>
// #include <initializer_list>
using namespace std;

class A {
public:
    int x, y, z; // Note: the data members must be public (A has to stay an aggregate).
    void print() {
        cout << x << " " << y << " " << z << endl;
    }
};

class B {
public:
    vector<int> vec_;
    int x_;
    void print() {
        cout << "vec_={ ";
        for (int x : vec_) {
            cout << x << " ";
        }
        cout << "}, x_=" << x_ << endl;
    }
};

class C {
public:
    vector<int> vec_;
    void print() {
        cout << "vec_={ ";
        for (int x : vec_) {
            cout << x << " ";
        }
        cout << "}" << endl;
    }
};

int main() {
    // A has no user-declared constructor, so it is an aggregate:
    // (1) and (2) both rely on aggregate initialization, no special constructor is generated.
    A a1{1,2,3};   // (1) list (aggregate) initialization
    A a2({4,5,6}); // (2) same result as (1): {4,5,6} initializes a temporary A that is copied/moved into a2
    a1.print();
    a2.print();

    B b1{{1,2,3}, 4}; // the inner "{1,2,3}" constructs vec_, "4" initializes x_
    b1.print();

    // C c1{1,2,3}; // (1) fails with "error: too many initializers for 'C'",
    //              // because C has a single data member vec_ but three initializers are given
    C c2{{1,2,3}};  // (2) the inner "{1,2,3}" constructs vec_, the outer "{}" initializes c2 itself
    c2.print();

    return 0;
}

std::initializer_list

#include <initializer_list>
#include <iostream>
#include <vector>
using namespace std;

struct A {
    int x, y, z;

    // Assumes il holds at least three elements; fewer would read past the end.
    A(initializer_list<int> il) {
        initializer_list<int>::iterator it = il.begin();
        x = *it++;
        y = *it++;
        z = *it++;
    }

    void print() {
        cout << x << " " << y << " " << z << endl;
    }
};

struct B {
    vector<int> vec;

    B(initializer_list<int> il) {
        vec = il; // assign the initializer_list to the vector
    }

    void print() {
        cout << "{ ";
        for (int x : vec) {
            cout << x << " ";
        }
        cout << "}" << endl;
    }
};

int main() {
    A a{1,2,3};
    a.print();

    B b{4,5,6};
    b.print();

    return 0;
}

Square brackets

  1. [ ] and test are bash builtins; [[ ]] is a shell keyword for conditional tests.

    $ type [
    [ is a shell builtin
    $ type test
    test is a shell builtin
    $ type [[
    [[ is a shell keyword
  2. [ ] and test are equivalent; both evaluate a conditional expression. See man [ or help [ for the documentation.

    $ help [
    [: [ arg... ]
    Evaluate conditional expression.

    This is a synonym for the "test" builtin, but the last argument must
    be a literal `]', to match the opening `['.
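  3. Inside the [[ ]] keyword, shell special characters lose their usual meaning: for example, &&, ||, > and < are treated as conditional operators rather than as list operators or redirections.

  4. Inside [ ], use -a and -o for logical AND and logical OR; inside [[ ]], use && and || instead.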

Parentheses

  1. $() is used for command substitution.
  2. Double parentheses (( )): evaluate arithmetic expressions, for example in comparisons.

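Curly braces

See: All about {Curly Braces} in Bash

  1. ${} is used to reference a variable.

    Compared with $var, ${var} removes ambiguity, for example:

    $ var=abc
    $ vartest=ABC
    # $var references the variable 'var'
    $ echo $var
    abc
    # references the variable 'vartest'
    $ echo $vartest
    ABC
    # references the variable 'var' and appends the string 'test'
    $ echo ${var}test
    abctest
  2. {} denotes grouping.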

Reference

regexp

ERegExp

Wildcard/Glob

man 7 glob

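glob - globbing pathnames. Globbing is performed by the shell itself (bash has no separate glob command).

It is mainly used to match file paths that contain wildcards. Its string-matching power is weaker than that of regular expressions.

It started out as a command named glob (short for global) on the Bell Labs Unix system, used to expand wildcards on the command line. Later the system provided the same functionality as the C library function glob(), which well-known shell interpreters adopted; the glob pattern matching available in shell scripts and on the command line derives from it. (Quoted from a blog post.)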

Wildcards

Strictly speaking, {} is not part of glob; in the shell it denotes a group, see: All about {Curly Braces} in Bash
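
A short sketch of both points, using Python's fnmatch module as a stand-in for shell globbing: wildcards match whole names, braces are not part of the pattern language, and every glob can be rewritten as a regular expression while the reverse does not hold. The file names are made up for illustration.

```python
import fnmatch
import re

names = ["main.c", "main.h", "util.c", "README"]

# "*" matches any run of characters, "?" exactly one character.
c_files = [name for name in names if fnmatch.fnmatchcase(name, "*.c")]
main_files = [name for name in names if fnmatch.fnmatchcase(name, "main.?")]

# Braces are not glob syntax: here they only match themselves literally.
brace_matches = fnmatch.fnmatchcase("main.c", "main.{c,h}")

# Any glob can be translated into a regular expression ...
as_regex = re.compile(fnmatch.translate("*.c"))
# ... but not the other way round: repetition of a group has no glob equivalent.
repeated_group = re.fullmatch(r"(ab)+", "ababab") is not None

if __name__ == "__main__":
    print(c_files)                         # ['main.c', 'util.c']
    print(main_files)                      # ['main.c', 'main.h']
    print(brace_matches)                   # False
    print(bool(as_regex.match("util.c")))  # True
    print(repeated_group)                  # True
```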

sed

awk

awk Command

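Format

awk -F' *|:' '/LISTEN/{print $2}'

Here, -F sets the field separator;
' *|:' is a regular expression meaning that either a run of spaces or ":" separates the fields (read literally, " *" matches zero or more spaces, so ' +|:' states the intent more precisely);
the // that follows holds another regular expression, used to select the matching lines;
{} holds the action.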
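
The separator / pattern / action structure can be mimicked in a few lines of Python. This is a rough analog for illustration, not a description of how awk is implemented; awk_like and the sample lines are hypothetical, and the separator is written as ' +|:' so it never matches the empty string.

```python
import re


def awk_like(lines, separator, pattern, field):
    """Rough analog of: awk -F'<separator>' '/<pattern>/{print $<field>}'."""
    selected = []
    for line in lines:
        if not re.search(pattern, line):  # the /pattern/ part: skip non-matching lines
            continue
        fields = re.split(separator, line)  # the -F part: split on the separator regex
        # The {print $field} part: awk numbers fields from 1 and prints "" if missing.
        selected.append(fields[field - 1] if field <= len(fields) else "")
    return selected


if __name__ == "__main__":
    sample = [
        "LISTEN 127.0.0.1:8080",
        "ESTAB 10.0.0.5:22",
        "LISTEN 0.0.0.0:443",
    ]
    print(awk_like(sample, r" +|:", r"LISTEN", 2))  # ['127.0.0.1', '0.0.0.0']
    print(awk_like(sample, r" +|:", r"LISTEN", 3))  # ['8080', '443']
```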

Conditionals

if-else in awk
use AND and OR in an awk program

awk '{if ($1 > 49151 && $1 < 65536) {print $1} }'

is equivalent to

awk '$1 > 49151 && $1 < 65536'

BEGIN/END

In AWK, BEGIN and END are special patterns that allow you to execute code before processing the input (BEGIN) or after processing all the input (END).

  • The BEGIN block is executed once at the beginning of the AWK program and it is typically used for initializing variables or performing setup tasks.
  • The END block is executed once after processing all input, and it is commonly used for final calculations, summaries, or printing results.
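
The three phases map onto an ordinary loop. The sketch below is a Python analog of a program such as awk 'BEGIN {total = 0} {total += $1} END {print total}', written only to show when each block runs; awk_sum is a hypothetical name.

```python
def awk_sum(lines, field=1):
    """Analog of: awk 'BEGIN {total = 0} {total += $field} END {print total}'."""
    total = 0.0                      # BEGIN: runs once, before any input is read
    for line in lines:               # main rule: runs once per input record
        fields = line.split()        # default awk splitting on runs of whitespace
        total += float(fields[field - 1])
    return total                     # END: runs once, after the last record


if __name__ == "__main__":
    print(awk_sum(["1 a", "2 b", "3.5 c"]))  # 6.5
```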

Special symbols

  • $ references a field; for example, $1 is the first field (typically the first column).
  • NF is the number of fields; if there are 7 columns in total, $NF is equivalent to $7.

grep

Example

  1. Find an unused port
# "$$" escapes "$" (this snippet lives in a Makefile)
# sort -n compares the ports numerically, so that 10000 sorts after 9999
max_port=$(shell netstat -antulpen 2> /dev/null \
| awk -F' *' '/^(tcp|udp)/{print $$4}' | cut -d: -f 2 \
| egrep "\w" | sort -n | tail -1)
#$(warning ${max_port})
port=$(shell expr $(max_port) + 1)
#$(warning ${port})
# Scan the ports in use above 1024 in ascending order (hence sort -un);
# at the first gap larger than 1 between adjacent ports, return "previous port + 1"
netstat -ant | awk '{print $4}' \
| awk -F: '{if ($NF ~ /^[0-9]+$/ && $NF > 1024) {print $NF}}' \
| sort -un \
| awk 'BEGIN {prev = 1024}
{if ($1 - prev > 1) { port = prev + 1; exit} else { prev = $1}}
END { if (port=="") {print prev + 1} else {print port}}'
# This approach does not use a regular expression.
# Bind a socket to port 0 and the system picks an unused port automatically.

#one-line mode:
#python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()'
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', 0))
addr = s.getsockname()
print(addr[1])
s.close()
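
For comparison, the gap scan done by the awk pipeline earlier in this list can be written as a small Python function. This is a sketch of the same logic under the same assumptions (ports at or below 1024 are skipped, the candidate list comes from somewhere like netstat); first_free_port is a hypothetical name, and unlike the bind-to-port-0 approach it does not guarantee that the port is still free when you use it.

```python
def first_free_port(used_ports, floor=1024):
    """Return the first port above floor that is missing from used_ports."""
    prev = floor
    # Sort and de-duplicate so that adjacent entries can be compared for gaps.
    for port in sorted({p for p in used_ports if p > floor}):
        if port - prev > 1:
            return prev + 1  # found a gap right after prev
        prev = port
    return prev + 1          # no gap: take the port after the highest one


if __name__ == "__main__":
    print(first_free_port([22, 80, 1025, 1026, 1028]))  # 1027
```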