Commit graph

2389 commits

Author SHA1 Message Date
Konstantin Pavlov
700ee28bbf Packages: check rpm database for actual provides.
Previously, we required an exact non-virtual package, however it's fine
if some package has a fully-virtual provides for what we need.
2023-03-22 16:55:25 -07:00
Alejandro Colomar
6e16d7ac5b Auto: mirroring installation structure in build tree.
This makes the build tree more organized, which is good for adding new
stuff.  Now, it's useful for example for adding manual pages in man3/,
but it may be useful in the future for example for extending the build
system to run linters (e.g., clang-tidy(1), Clang analyzer, ...) on the
C source code.

Previously, the build tree was quite flat, and looked like this (after
`./configure && make`):

    $ tree -I src build
    build
    ├── Makefile
    ├── autoconf.data
    ├── autoconf.err
    ├── echo
    ├── libnxt.a
    ├── nxt_auto_config.h
    ├── nxt_version.h
    ├── unitd
    └── unitd.8

    1 directory, 9 files

And after this patch, it looks like this:

    $ tree -I src build
    build
    ├── Makefile
    ├── autoconf.data
    ├── autoconf.err
    ├── bin
    │   └── echo
    ├── include
    │   ├── nxt_auto_config.h
    │   └── nxt_version.h
    ├── lib
    │   ├── libnxt.a
    │   └── unit
    │       └── modules
    ├── sbin
    │   └── unitd
    ├── share
    │   └── man
    │       └── man8
    │           └── unitd.8
    └── var
        ├── lib
        │   └── unit
        ├── log
        │   └── unit
        └── run
            └── unit

    17 directories, 9 files

It also solves one issue introduced in
5a37171f73 ("Added default values for pathnames.").  Before that
commit, it was possible to run unitd from the build system
(`./build/unitd`).  Now, since it expects files in a very specific
location, that has been broken.  By having a directory structure that
mirrors the installation, it's possible to trick it to believe it's
installed, and run it from there:

    $ ./configure --prefix=./build
    $ make
    $ ./build/sbin/unitd

Fixes: 5a37171f73 ("Added default values for pathnames.")
Reported-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-29 00:41:08 +02:00
Alejandro Colomar
5ba79b9b52 Renamed --libstatedir to --statedir.
In BSD systems, it's usually </var/db> or some other dir under </var>
that is not </var/lib>, so $statedir is a more generic name.  See
hier(7).

Reported-by: Andrei Zeliankou <zelenkov@nginx.com>
Reported-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Liam Crilly <liam@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-29 00:40:40 +02:00
Konstantin Pavlov
e242b1454d Tests: relaxed jar glob.
We install jars with names like websocket-api-${NXT_JAVA_MODULE}-$NXT_VERSION.jar,
which translates to versioned NXT_JAVA_MODULE in the packaging system, e.g.
websocket-api-java11-1.30.0.jar.
2023-03-27 13:16:47 -07:00
Andrei Zeliankou
d5382aebb7 Unit 1.29.1 release. 2023-02-28 16:18:19 +00:00
Andrei Zeliankou
0af1253c17 Generated Dockerfiles for Unit 1.29.1. 2023-02-28 14:52:32 +00:00
Andrei Zeliankou
32f9c3d6f9 Added version 1.29.1 CHANGES. 2023-02-28 16:16:27 +00:00
Andrei Zeliankou
8295a0eb08 Changes moved to the correct section. 2023-02-28 16:01:54 +00:00
Andrei Zeliankou
ca9988171b Added missing fixes in changes.xml. 2023-02-28 13:09:06 +00:00
Konstantin Pavlov
f4298180eb contrib: updated njs to 0.7.10. 2023-02-27 15:45:04 -08:00
Andrew Clayton
29471c8d32 Set a safer umask(2) when running as a daemon.
When running as a daemon. unit currently sets umask(0), i.e no umask.
This is resulting in various directories being created with a mode of
0777, e.g

  rwxrwxrwx

this is currently affecting cgroup and rootfs directories, which are
being created with a mode of 0777, and when running as a daemon as there
is no umask to restrict the permissions.

This also affects the language modules (the umask is inherited over
fork(2)) whereby unless something explicitly sets a umask, files and
directories will be created with full permissions, 0666 (rw-rw-rw-)/
0777 (rwxrwxrwx) respectively.

This could be an unwitting security issue.

My original idea was to just remove the umask(0) call and thus inherit
the umask from the executing shell/program.

However there was some concern about just inheriting whatever umask was
in effect.

Alex suggested that rather than simply removing the umask(0) call we
change it to a value of 022 (which is a common default), which will
result in directories and files with permissions at most of 0755
(rwxr-xr-x) & 0644 (rw-r--r--).

If applications need some other umask set, they can (as they always have
been able to) set their own umask(2).

Suggested-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Liam Crilly <liam@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-23 12:01:14 +00:00
Andrew Clayton
5c9113ddac Isolation: rootfs: Set the sticky bit on the tmp directory.
When using the 'rootfs' isolation option, by default a tmpfs filesystem
is mounted on tmp/. Currently this is mounted with a mode of 0777, i.e

  drwxrwxrwx.   3 root   root   60 Feb 22 11:56 tmp

however this should really have the sticky bit[0] set (as is per-normal for
such directories) to prevent users from having free reign on the files
contained within.

What we really want is it mounted with a mode of 01777, i.e

  drwxrwxrwt.   3 root   root   60 Feb 22 11:57 tmp

[0]: To quote inode(7)

 "The sticky bit (S_ISVTX) on a directory means that a file in that
  directory can be renamed or deleted only by the owner of the file, by
  the owner of the directory, and by a privileged process."

Reviewed-by: Liam Crilly <liam@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-22 16:04:53 +00:00
Andrei Zeliankou
1b7cf1f3d0 Tests: added Python tests with encoding. 2023-02-21 15:35:38 +00:00
Konstantin Pavlov
565dcbb139 Docker: bumped language versions. 2023-02-13 17:04:24 -08:00
Konstantin Pavlov
2a597c5c7a Docker: limited the waiting time for control socket creation.
While at it, fixed a typo.
2023-02-13 17:04:24 -08:00
Konstantin Pavlov
054518bb36 Docker: made dockerfiles use a single stage build process. 2023-02-13 17:04:24 -08:00
Konstantin Pavlov
eb47174875 Docker: added a target to generate Docker library definition. 2023-02-13 17:04:24 -08:00
Konstantin Pavlov
a8f21079e0 Docker: cleanup unused targets. 2023-02-13 17:04:24 -08:00
Andrei Zeliankou
b24257c8f6 Tests: added tests for the "log_route" option. 2023-03-21 19:20:23 +00:00
Alejandro Colomar
0ebce31c92 HTTP: added route logging.
-  Configuration: added "/config/settings/http/log_route".

   Type: bool
   Default: false

   This adds configurability to the error log.  It allows enabling and
   disabling logs related to how the router performs selection of the
   routes.

-  HTTP: logging request line.

   Log level: [notice]

   The request line is essential to understand which logs correspond to
   which request when reading the logs.

-  HTTP: logging route that's been discarded.

   Log level: [info]

-  HTTP: logging route whose action is selected.

   Log level: [notice]

-  HTTP: logging when "fallback" action is taken.

   Log level: [notice]

Closes: <https://github.com/nginx/unit/issues/758>
Link: <https://github.com/nginx/unit/pull/824>
Link: <https://github.com/nginx/unit/pull/839>
Suggested-by: Timo Stark <t.stark@nginx.com>
Suggested-by: Mark L Wood-Patrick <mwoodpatrick@gmail.com>
Suggested-by: Liam Crilly <liam@nginx.com>
Tested-by: Liam Crilly <liam@nginx.com>
Acked-by: Artem Konev <a.konev@f5.com>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-21 13:02:38 +01:00
Alejandro Colomar
773c341d70 HTTP: rewrote while loop as for loop.
This considerably simplifies the function, and will also help log the
iteration in which we are, which corresponds to the route array element.

Link: <https://github.com/nginx/unit/pull/824>
Link: <https://github.com/nginx/unit/pull/839>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Tested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-03-21 13:02:38 +01:00
Andrew Clayton
c18dd1f65b Default PR_SET_NO_NEW_PRIVS to off.
This prctl(2) option was enabled in commit 0277d8f1 ("Isolation: Fix the
enablement of PR_SET_NO_NEW_PRIVS.") and this was being set by default.

This prctl(2) when enabled renders (amongst other things) the set-UID
and set-GID bits on executables ineffective after an execve(2).

This causes an issue for applications that want to execute the
sendmail(8) binary, this includes the PHP mail() function, which is
usually set-GID.

After some internal discussion it was decided to disable this option by
default.

Closes: <https://github.com/nginx/unit/issues/852>
Fixes: 0277d8f1 ("Isolation: Fix the enablement of PR_SET_NO_NEW_PRIVS.")
Fixes: e2b53e16 ("Added "rootfs" feature.")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:28:46 +00:00
Andrew Clayton
7d0ceb82c7 Improve an error message regarding Unix domain sockets.
When starting unit, if its Unix domain control socket was already active
you would get an error message like

  2023/03/15 18:07:55 [alert] 53875#8669650 connect(5, unix:/tmp/control.sock) succeed, address already in use

which is confusing in a couple of regards, firstly we have the classic
success/failure message and secondly 'address already in use' is an
actual errno value, EADDRINUSE and we didn't get an error from this
connect(2).

Re-word this error message for greater clarity.

Reported-by: Liam Crilly <liam.crilly@nginx.com>
Cc: Liam Crilly <liam.crilly@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:28:46 +00:00
Andrew Clayton
2e3e1c7e7b Socket: Remove Unix domain listen sockets upon reconfigure.
Currently when using Unix domain sockets for requests, if unit is
reconfigured then it will fail if it tries to bind(2) again to a Unix
domain socket with something like

  2023/02/25 19:15:50 [alert] 35274#35274 bind(\"unix:/tmp/unit.sock\") failed (98: Address already in use)

When closing such a socket we really need to unlink(2) it. However that
presents a problem in that when running as root, while the main process
runs as root and creates the socket, it's the router process, that runs
as an unprivileged user, e.g nobody, that closes the socket and would
thus remove it, but couldn't due to not having permission, even if the
socket is mode 0666, you need write permissions on the containing
directory to remove a file.

There are several options to solve this, all with varying degrees of
complexity and utility.

  1) Give the user who the router process runs as write permission to
     the directory containing the listen sockets. These can then be
     unlink(2)'d from the router process.

     Simple and would work, but perhaps not the most elegant.

  2) Using capabilities(7). The router process could temporarily attain
     the CAP_DAC_OVERRIDE capability, unlink(7) the socket, then
     relinquish the capability until required again.

     These are Linux specific (other systems may have similar mechanisms
     which would be extra work to support). There is also a, albeit
     small, window where the router process is running with elevated
     privileges.

  3) Have the main process do the unlink(2), it is after all the process
     that created the socket.

     This is what this commit implements.

We create a new port IPC message type of NXT_PORT_MSG_SOCKET_UNLINK,
that is used by the router process to notify the main process about a
Unix domain socket to unlink(2).

Upon doing a reconfigure the router process will call
nxt_router_listen_socket_release() which will close the socket, we
extend this function in the case of non-abstract Unix domain sockets, so
that it will send a message to the main process containing a copy of the
nxt_sockaddr_t structure that will contain the filename of the socket.

In the main process the handler that we have defined,
nxt_main_port_socket_unlink_handler(), for this message type will run
and allow us to look for the socket in question in the listen_sockets
array and remove it and unlink(2) the socket.

This then allows the reconfigure to work if it tries to bind(2) again to
a socket that previously existed.

Link: <https://github.com/nginx/unit/issues/669>
Link: <https://github.com/nginx/unit/pull/735>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:28:23 +00:00
Andrew Clayton
8f0192bb4c Remove some dormant code from nxt_process_quit().
In nxt_process_quit() there is a loop that iterates over the
task->thread->runtime->listen_sockets array and closes the connections.

This code has been there from the beginning

  $ git log --pretty=oneline -S'if (rt->listen_sockets != NULL)'
  e9e5ddd5a5 Refactor of process management.
  6f2c9acd18 Processes refactoring. The cycle has been renamed to the runtime.
  $ git log --pretty=oneline -S'if (cycle->listen_sockets != NULL) {'
  6f2c9acd18 Processes refactoring. The cycle has been renamed to the runtime.
  16cbf3c076 Initial version.

but never seems to have been used (AFAICT and certainly not recently,
confirmed by code inspection and running pytests with a bunch of
language modules enabled and the code in question was never executed) as
the listen_sockets array has never been populated... until now.

The previous commit now adds Unix domain sockets to this array so that
they can be unlink(2)'d upon exit and reconfiguration.

This has now caused this dormant code to become active as it now tries
to close these sockets (from at least the prototype processes), this
array is inherited via fork by other processes.

The file descriptor for these sockets is set to -1 when they are put
into this array. This then results in close(-1) calls which caused
multiple failures in the pytests such as

  >       assert not alerts, 'alert(s)'
  E       AssertionError: alert(s)
  E       assert not ['2023/03/09 23:26:14 [alert] 137673#137673 socket close(-1) failed (9: Bad file descriptor)']

I think the simplest thing is to just remove this code.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:03:56 +00:00
Andrew Clayton
ccaad38bc5 Socket: Remove Unix domain listen sockets at shutdown.
If we don't remove the Unix domain listen socket file then when Unit
restarts it get an error like

  2023/02/25 23:10:11 [alert] 36388#36388 bind(\"unix:/tmp/unit.sock\") failed (98: Address already in use)

This patch makes use of the listen_sockets array, that is already
allocated in the main process but never populated, to place the Unix
domain listen sockets into.

At shutdown we can then loop through this array and unlink(2) any Unix
domain sockets found therein.

Closes: <https://github.com/nginx/unit/issues/792>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17 04:03:56 +00:00
Andrew Clayton
172ceba5b6 Router: More accurately allocate request buffer memory.
In nxt_router_prepare_msg() we create a buffer (nxt_unit_request_t *req)
that gets sent to an application process that contains details about a
client request.

This buffer was always a little larger than needed due to allocating space
for the remote address _and_ port and the local address _and_ port. We
also allocate space for the local port separately.

->{local,remote}->length includes the port number and ':' and also the
'[]' for IPv6. E.g [2001:db8::1]:8080

->{local,remote}->address_length represents the length of the unadorned
IP address. E.g 2001:db8::1

Update the buffer size so that we only allocate what is actually needed.

Suggested-by: Zhidao HONG <z.hong@f5.com>
Cc: Zhidao HONG <z.hong@f5.com>
Reviewed-by: Zhidao HONG <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-14 17:07:08 +00:00
Andrei Zeliankou
28bdeec614 Tests: added Perl test with many responses using streaming body. 2023-03-14 11:06:09 +00:00
Andrew Clayton
fa81d7a11a Perl: Fix a crash in the language module.
User @bes-internal reported a Perl module crasher on GitHub.

This was due to a Perl application sending back two responses, for each
response we would call down into XS_NGINX__Unit__Sandbox_cb(), the first
time pctx->req would point to a valid nxt_unit_request_info_t, the
second time pctx->req would be NULL.

Add an invalid responses check which covers this case.

Closes: <https://github.com/nginx/unit/issues/841>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-10 21:40:28 +00:00
Andrew Clayton
78e1122a3c Router: Fix allocation of request buffer sent to application.
This fixes an issue reported by @Peter2121 on GitHub.

In nxt_router_prepare_msg() we create a buffer (nxt_unit_request_t *req)
that gets sent to an application process that contains details about a
client request.

The req structure comprises various members with the final member being
an array (specified as a flexible array member, with its actual length
denoted by the req->fields_count member) of nxt_unit_field_t's. These
structures specify the length and offset for the various request headers
name/value pairs which are stored after some request metadata that is
stored immediately after this array of structs as individual nul
terminated strings.

After this we have the body content data (if any). So it looks a little
like

  (gdb) x /64bs 0x7f38c976e060
  0x7f38c976e060: "\353\346\244\t\006"		<-- First nxt_unit_field_t
  0x7f38c976e066: ""
  0x7f38c976e067: ""
  0x7f38c976e068: "T\001"
  0x7f38c976e06b: ""
  0x7f38c976e06c: "Z\001"
  0x7f38c976e06f: ""
  ...
  0x7f38c976e170: "\362#\244\v$"		<-- Last nxt_unit_field_t
  0x7f38c976e176: ""
  0x7f38c976e177: ""
  0x7f38c976e178: "\342\002"
  0x7f38c976e17b: ""
  0x7f38c976e17c: "\352\002"
  0x7f38c976e17f: ""
  0x7f38c976e180: "POST"			<-- Start of request metadata
  0x7f38c976e185: "HTTP/1.1"
  0x7f38c976e18e: "unix:"
  0x7f38c976e194: "unix:/dev/shm/842.sock"
  0x7f38c976e1ab: ""
  0x7f38c976e1ac: "fedora"
  0x7f38c976e1b3: "/842.php"
  0x7f38c976e1bc: "HTTP_HOST"			<-- Start of header fields
  0x7f38c976e1c6: "fedora"
  0x7f38c976e1cd: "HTTP_X_FORWARDED_PROTO"
  0x7f38c976e1e4: "https"
  ...
  0x7f38c976e45a: "HTTP_COOKIE"
  0x7f38c976e466: "PHPSESSID=8apkg25r9s9vju3pi085i21eh4"
  0x7f38c976e48b: "public_form=sended"		<-- Body content

Well that's how things are supposed to look! When using Unix domain
sockets what we actually got looked like

  ...
  0x7f6141f3445a: "HTTP_COOKIE"
  0x7f6141f34466: "PHPSESSID=uo5b2nu9buijkc89jotbgmd60vpublic_form=sended"

Here, the body content (from a POST for example) has been appended
straight onto the end of the last header field value. In this case
corrupting the PHP session cookie.  The body content would still be
found by the application as its offset into this buffer is correct.

This problem was actually caused by a0327445 ("PHP: allowed to specify
URLs without a trailing '/'.") which added an extra item into this
request buffer specifying the port number that unit is listening on that
handled this request.

Unfortunately when I wrote that patch I didn't increase the size of this
request buffer to accommodate it.

When using normal TCP sockets we actually end up allocating more space
than required for this buffer, we track the end of this buffer up to
where the body content would go and so we have a few spare bytes between
the nul byte of the last field header value and the start of the body
content.

When using Unix domain sockets, they have no associated port number and
thus the port number has a length of 0 bytes, but we still write a '\0'
in there using up a byte that we didn't account for, this causes us to
loose the nul byte of the last header fields value causing the body data
to be appended to the last header field value.

The fix is simple, account for the local port length, we also add 1 to
it, this covers the nul byte, even if there is no port as with Unix
domain sockets.

Closes: <https://github.com/nginx/unit/issues/842>
Fixes: a0327445 ("PHP: allowed to specify URLs without a trailing '/'.")
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-10 21:39:57 +00:00
Andrei Zeliankou
4ed4283cff Tests: _clear_temp_dir() function introduced.
Also added temporary directory clearing after checking available
modules to prevent garbage environment when tests start.
2023-03-07 14:47:39 +00:00
Dave McAllister
d13c4fb503 Adding the NGINX Code of Conduct to the repo.
This adds the NGINX Code of Conduct file to the repo, one of the
community guides requested and tracked for community health by the NGINX
community and by GitHub insights.

[ Re-flowed the commit message - Andrew ]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-07 13:05:41 +00:00
Konstantin Pavlov
1d515879f1 contrib: fixed njs make rule. 2023-02-28 10:42:18 -08:00
Andrei Zeliankou
814815a3c5 Merged with the 1.29 branch. 2023-03-01 18:25:52 +00:00
Andrei Zeliankou
32c9b8c3cd Added missing fixes in changes.xml. 2023-02-28 13:42:39 +00:00
Konstantin Pavlov
132f7d18a2 contrib: updated njs to 0.7.10. 2023-02-27 15:45:04 -08:00
Liam Crilly
bbeccabe0a Tools: improved detection of unitd control socket.
Now unitc obtains the path to the unitd binary from information
contained in the unitd: main process. If unitd was started with an
explicit path then that path will be used to obtain the default
control socket, instead of using the unitd binary in $PATH.
2023-02-27 14:25:58 +00:00
Andrew Clayton
5ed6eae718 Set a safer umask(2) when running as a daemon.
When running as a daemon. unit currently sets umask(0), i.e no umask.
This is resulting in various directories being created with a mode of
0777, e.g

  rwxrwxrwx

this is currently affecting cgroup and rootfs directories, which are
being created with a mode of 0777, and when running as a daemon as there
is no umask to restrict the permissions.

This also affects the language modules (the umask is inherited over
fork(2)) whereby unless something explicitly sets a umask, files and
directories will be created with full permissions, 0666 (rw-rw-rw-)/
0777 (rwxrwxrwx) respectively.

This could be an unwitting security issue.

My original idea was to just remove the umask(0) call and thus inherit
the umask from the executing shell/program.

However there was some concern about just inheriting whatever umask was
in effect.

Alex suggested that rather than simply removing the umask(0) call we
change it to a value of 022 (which is a common default), which will
result in directories and files with permissions at most of 0755
(rwxr-xr-x) & 0644 (rw-r--r--).

If applications need some other umask set, they can (as they always have
been able to) set their own umask(2).

Suggested-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Liam Crilly <liam@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-24 15:48:15 +00:00
Andrew Clayton
ffa86b6edc Isolation: rootfs: Set the sticky bit on the tmp directory.
When using the 'rootfs' isolation option, by default a tmpfs filesystem
is mounted on tmp/. Currently this is mounted with a mode of 0777, i.e

  drwxrwxrwx.   3 root   root   60 Feb 22 11:56 tmp

however this should really have the sticky bit[0] set (as is per-normal for
such directories) to prevent users from having free reign on the files
contained within.

What we really want is it mounted with a mode of 01777, i.e

  drwxrwxrwt.   3 root   root   60 Feb 22 11:57 tmp

[0]: To quote inode(7)

 "The sticky bit (S_ISVTX) on a directory means that a file in that
  directory can be renamed or deleted only by the owner of the file, by
  the owner of the directory, and by a privileged process."

Reviewed-by: Liam Crilly <liam@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-24 15:48:00 +00:00
Andrei Zeliankou
7934dcabbc Tests: switched to using f-strings.
Previously, it was necessary to support older versions of Python for
compatibility.  F-strings were released in Python 3.6.  Python 3.5 was
marked as unsupported by the end of 2020, so now it's possible to start
using f-strings safely for better readability and performance.
2023-02-21 17:21:29 +00:00
Andrei Zeliankou
fcabbf09d8 Tests: added Python tests with encoding. 2023-02-21 15:35:38 +00:00
Andrei Zeliankou
7f046c80b9 Tests: removed list usage as default argument.
Mutable types as default arguments is bad practice since
they are evaluated only once when the function is defined.
2023-02-21 14:40:47 +00:00
Alejandro Colomar
e249dd4727 Tools: using nicer characters for showing a tree.
Especially in small trees, ASCII characters are confusing.  Use nicer
UTF-8 characters, which are more readable to the audience of this
script.  We don't expect the audience of this script to have limited
environments where these characters will not be shown, but if that
happens, we could improve the script to select the caracters based on
the locale.

Suggested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2023-02-20 14:12:38 +01:00
Andrew Clayton
96f33c0395 Remove the nxt_getpid() alias.
Since the previous commit, nxt_getpid() is only ever aliased to
getpid(2).

nxt_getpid() was only used once in the code, while there are multiple
direct uses of getpid(2)

  $ grep -r "getpid()" src/
  src/nxt_unit.c:    nxt_unit_pid = getpid();
  src/nxt_process.c:    nxt_pid = nxt_getpid();
  src/nxt_process.c:    nxt_pid = getpid();
  src/nxt_lib.c:    nxt_pid = getpid();
  src/nxt_process.h:#define nxt_getpid()                                                          \
  src/nxt_process.h:#define nxt_getpid()                                                          \
  src/nxt_process.h:    getpid()

Just remove it and convert the _single_ instance of nxt_getpid() to
getpid(2).

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00
Andrew Clayton
3351bb4ff5 Isolation: Remove the syscall(SYS_getpid) wrapper.
When using SYS_clone we used the getpid kernel system call directly via
syscall(SYS_getpid) to avoid issues with cached pids.

However since we are now only using fork(2) (+ unshare(2) for
namespaces) we no longer need to call the kernel getpid directly as the
fork(2) will ensure the cached pid is invalidated.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00
Andrew Clayton
1388d311c6 Isolation: Remove nxt_clone().
Since the previous commit, this is no longer used.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00
Andrew Clayton
b0e2d9d0a1 Isolation: Switch to fork(2) & unshare(2) on Linux.
On GitHub, @razvanphp & @hbernaciak both reported issues running the
APCu PHP module under Unit.

When using this module they were seeing errors like

  'apcu_fetch(): Failed to acquire read lock'

However when running APCu under php-fpm, everything was fine.

The issue turned out to be due to our use of SYS_clone breaking the
pthreads(7) API used by APCu.  Even if we had been using glibc's
clone(2) wrapper we would still have run into problems due to a known
issue there.

Essentially the problem is when using clone, glibc doesn't update the
TID cache, so the child ends up having the same TID as the parent and
that is used in various parts of pthreads(7) such as in the various
locking primitives, so when APCu was grabbing a lock it ended up using
the TID of the main unit process (rather than that of the php
application processes that was grabbing the lock).

So due to the above what was happening was when one of the application
processes went to grab either a read or write lock, the lock was
actually being attributed to the main unit process.  If a process had
acquired the write lock, then if a process tried to acquire a read or
write lock then glibc would return EDEADLK due to detecting a deadlock
situation due to thinking the process already held the write lock when
in fact it didn't.

It seems the right way to do this is via fork(2) and unshare(2).  We
already use fork(2) on other platforms.

This requires a few tricks to keep the essence of the processes the same
as before when using clone

  1) We use the prctl(2) PR_SET_CHILD_SUBREAPER option (if its
     available, since Linux 3.4) to make the main unit process inherit
     prototype processes after a double fork(2), rather than them being
     reparented to 'init'.

     This avoids needing to ^C twice to fully exit unit when running in
     the foreground.  It's probably also better if they maintain their
     parent child relationship where possible.

  2) We use a double fork(2) technique on the prototype processes to
     ensure they themselves end up in a new PID namespace as PID 1 (when
     CLONE_NEWPID is being used).

     When using unshare(CLONE_NEWPID), the calling process is _not_
     placed in the namespace (as discussed in pid_namespaces(7)).  It
     only sets things up so that subsequent children are placed in a PID
     namespace.

     Having the prototype processes as PID 1 in the new PID namespace is
     probably a good thing and matches the behaviour of clone(2).  Also,
     some isolation tests break if the prototype process is not PID 1.

  3) Due to the above double fork(2) the main unit process looses track
     of the prototype process ID, which it needs to know.

     To solve this, we employ a simple pipe(2) between the main unit and
     prototype processes and pass the prototype grandchild PID from the
     parent of the second fork(2) before exiting.  This needs to be done
     from the parent and not the grandchild, as the grandchild will see
     itself having a PID of 1 while the main process needs its
     externally visible PID.

Link: <https://www.php.net/manual/en/book.apcu.php>
Link: <https://sourceware.org/bugzilla/show_bug.cgi?id=21793>
Closes: <https://github.com/nginx/unit/issues/694>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00
Andrew Clayton
d98a1b0dd7 Enable the PR_SET_CHILD_SUBREAPER prctl(2) option on Linux.
This prctl(2) option can be used to set the "child subreaper" attribute
of the calling process.  This allows a process to take on the role of
'init', which means the process will inherit descendant processes when
their immediate parent terminates.

This will be used in an upcoming commit that uses a double fork(2) +
unshare(2) to create a new PID namespace.  The parent from the second
fork will terminate leaving the child process to be inherited by 'init'.
Aside from it being better to maintain the parent/child relationships
between the various unit processes, without setting this you need to ^C
twice to fully quit unit when running in the foreground after the double
fork.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00
Andrew Clayton
3ecdd2c69c Isolation: Rename NXT_HAVE_CLONE -> NXT_HAVE_LINUX_NS.
Due to the need to replace our use of clone/__NR_clone on Linux with
fork(2)/unshare(2) for enabling Linux namespaces(7) to keep the
pthreads(7) API working.  Let's rename NXT_HAVE_CLONE to
NXT_HAVE_LINUX_NS, i.e name it after the feature, not how it's
implemented, then in future if we change how we do namespaces again we
don't have to rename this.

Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00
Andrew Clayton
763396b8be Isolation: Fix the enablement of PR_SET_NO_NEW_PRIVS.
This prctl(2) option is checked for in auto/isolation, unfortunately due
to a typo this feature has never been enabled.

In the auto/isolation script the feature name was down as
NXT_HAVE_PR_SET_NO_NEW_PRIVS0, which means we end up with the following
in build/nxt_auto_config.h

  #ifndef NXT_HAVE_PR_SET_NO_NEW_PRIVS0
  #define NXT_HAVE_PR_SET_NO_NEW_PRIVS0  1
  #endif

Whereas everywhere else is checking for NXT_HAVE_PR_SET_NO_NEW_PRIVS.

This also guards the inclusion of sys/prctl.h in src/nxt_process.c which
is required by a subsequent commit.

Fixes: e2b53e1 ("Added "rootfs" feature.")
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17 21:24:18 +00:00