Commit graph

176 commits

Author SHA1 Message Date
Alejandro Colomar
7e4a8a5422 Disallowed abstract unix socket syntax in non-Linux systems.
The previous commit added/fixed support for abstract Unix domain sockets
on Linux with a leading '@' or '\0'.  To be consistent across platforms,
treat those prefixes as markers for abstract sockets everywhere, and fail
if the platform doesn't support abstract sockets.

That avoids mistakes when copying a config file from a Linux system to a
non-Linux one, where the same name would otherwise silently create a
normal (pathname) socket.
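
As an illustration only, the rule described in this commit could be
sketched in C as follows.  The function classify_unix_socket(), the enum,
and the have_abstract parameter are hypothetical names, not taken from the
Unit sources; a real build would more likely decide platform support at
configure time.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum sock_kind {
    SOCK_KIND_PATHNAME,
    SOCK_KIND_ABSTRACT
};

/*
 * Classify a configured Unix socket name of 'len' bytes.
 * A leading '@' or '\0' always marks an abstract socket; if the
 * platform has no abstract sockets (have_abstract == 0), fail
 * instead of silently creating a pathname socket.
 * Returns 0 on success, or -1 with errno set.
 */
int
classify_unix_socket(const char *name, size_t len, int have_abstract,
    enum sock_kind *kind)
{
    if (name == NULL || len == 0) {
        errno = EINVAL;
        return -1;
    }

    if (name[0] == '@' || name[0] == '\0') {
        if (!have_abstract) {
            errno = ENOTSUP;
            return -1;
        }

        *kind = SOCK_KIND_ABSTRACT;
        return 0;
    }

    *kind = SOCK_KIND_PATHNAME;
    return 0;
}
```

With have_abstract set to 0, a name such as "@unit" is rejected rather
than turned into a file named "@unit" in the current directory.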
2022-08-18 18:58:41 +02:00
Alejandro Colomar
d8e0768a5b Fixed support for abstract Unix sockets.
Unix domain sockets are normally backed by files in the
filesystem.  This has historically been problematic when closing
and reopening such sockets, since SO_REUSEADDR is ignored for
Unix sockets (POSIX left the behavior of SO_REUSEADDR as
implementation-defined, and most --if not all-- implementations
decided to just ignore this flag).

Many solutions are available for this problem, but all of them
have important caveats:

- unlink(2) the file when it's not needed anymore.

  This is not easy, because the process that controls the fd may
  not be the same process that created the file, and may not have
  file permissions to remove it.

  Further solutions can be applied to that caveat:

  - unlink(2) the file right after creation.

    This will remove the pathname from the filesystem without
    closing the socket (it will continue to live until the last fd
    is closed).  This is not useful for us, since we need the
    pathname of the socket as its interface.

  - chown(2) or chmod(2) the directory that contains the socket.

    For removing a file from the filesystem, a process needs
    write permissions in the containing directory.  We could
    put sockets in dummy directories that can be chown(2)ed to
    nobody.  This could be dangerous, though, as we don't control
    the socket names.  It is our users who configure the socket
    name in their configuration, and they can easily overlook
    the many implications of not choosing an appropriate
    socket pathname.  A user could unknowingly put the socket in a
    directory that is not supposed to be owned by user nobody, and
    if we blindly chown(2) or chmod(2) the directory, we could be
    creating a big security hole.

  - Ask the main process to remove the socket.

    This would require a very complex communication mechanism with
    the main process, which is not impossible, but let's avoid it
    if there are simpler solutions.

  - Give the child process the CAP_DAC_OVERRIDE capability.

    That is one of the most powerful capabilities.  A process with
    that capability can be considered root for most practical
    aspects.  Even if the capability is disabled for most of the
    lifetime of the process, there's a slight chance that a
    malicious actor could activate it and then easily do serious
    damage to the system.

- unlink(2) the file right before calling bind(2).

  This is dangerous because another process (for example, another
  running instance of unitd(8)), could be using the socket, and
  removing the pathname from the filesystem would be problematic.
  To do this correctly, a lot of checks should be added before the
  actual unlink(2), which is error-prone, and difficult to do
  correctly, and atomically.

- Use abstract-namespace Unix domain sockets.

  This is the simplest solution, as it only requires accepting a
  slightly different syntax (basically a @ prefix) for the socket
  name, to transform it into a string starting with a null byte
  ('\0') that the kernel can understand.  The patch is minimal.

  Since abstract sockets live in an abstract namespace, they don't
  create files in the filesystem, so there's no need to remove
  them later.  The kernel removes the name when the last fd to it
  has been closed.

  One caveat is that only Linux currently supports this kind of
  Unix sockets.  Of course, a solution to that could be to ask
  other kernels to implement such a feature.

  Another caveat is that filesystem permissions can't be used to
  control access to the socket file (since, of course, there's no
  file).  Anyone knowing the socket name can access it.  The
  only method to control access to it is by using
  network_namespaces(7).  Since in unitd(8) we're using 0666 file
  sockets, abstract sockets should be no more insecure than that
  (anyone can already read/write to the listener sockets).

- Ask the kernel to implement a simpler way to unlink(2) socket
  files when they are not needed anymore.  I've suggested that to
  the <linux-fsdevel@vger.kernel.org> mailing list, in:
<lore.kernel.org/linux-fsdevel/0bc5f919-bcfd-8fd0-a16b-9f060088158a@gmail.com/T>

In this commit, I decided to go for the easiest/simplest solution,
which is abstract sockets.  In fact, we already had partial
support.  This commit only fixes a small bug in the existing
code so that abstract Unix sockets work:

- Don't chmod(2) the socket if it's an abstract one.

This fixes the creation of abstract sockets, but doesn't make them
usable, since we produce them with a trailing '\0' in their name.
That will be fixed in the following commit.
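
A minimal sketch of how a name with the '@' prefix maps to a kernel
address, based on unix(7) on Linux rather than on the Unit sources
(abstract_sockaddr() is a hypothetical helper).  For abstract addresses,
the address length passed to bind(2) delimits the name, so the length
is computed without a terminating '\0'; a trailing null byte would
become part of the socket name.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fill 'sun' with the abstract address for a name such as "@unit".
 * Returns the address length to pass to bind(2), or 0 if the name
 * is not an abstract one or does not fit.
 */
socklen_t
abstract_sockaddr(const char *name, struct sockaddr_un *sun)
{
    size_t  n;

    if (name == NULL || name[0] != '@') {
        return 0;
    }

    n = strlen(name + 1);

    /* One byte of sun_path is taken by the leading null byte. */
    if (n == 0 || n > sizeof(sun->sun_path) - 1) {
        return 0;
    }

    memset(sun, 0, sizeof(*sun));
    sun->sun_family = AF_UNIX;

    /* sun_path[0] stays '\0': this marks the abstract namespace. */
    memcpy(sun->sun_path + 1, name + 1, n);

    /* Count the leading '\0' and the name, but no trailing '\0'. */
    return (socklen_t) (offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```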

This closes #669 issue on GitHub.
2022-08-18 18:55:14 +02:00
Max Romanov
900828cc4b Fixing isolated process PID manipulation.
Registering an isolated PID in the global PID hash is wrong
because it can be duplicated.  Isolated processes are stored only
in the children list until the response for the WHOAMI message is
processed and the global PID is discovered.

To remove isolated siblings, a pointer to the children list is
introduced in the nxt_process_init_t struct.

This closes #633 issue on GitHub.
2022-08-11 13:33:46 +01:00
Alejandro Colomar
d37b76232e Put changes entry in the correct position. 2022-08-08 12:13:28 +02:00
Zhidao HONG
3f8cf62c03 Log: customizable access log format. 2022-07-28 11:05:04 +08:00
Zhidao HONG
2bd4a45527 Ruby: fixed segfault on SIGTERM signal.
This closes #562 issue on GitHub.
2022-07-28 11:00:15 +08:00
Alejandro Colomar
9b4b4925b3 Ruby: fixed contents of SCRIPT_NAME.
Having the basename of the script pathname was incorrect.  While
we don't have something more accurate, the best thing to do is to
have it empty (which should be the right thing most of the time).

This closes #715 issue on GitHub.

The bug was introduced in git commit
0032543fa6
'Ruby: added the Rack environment parameter "SCRIPT_NAME".'.
2022-07-27 12:46:42 +02:00
Alejandro Colomar
91ffd08d11 Fixed line removed by accident.
When fixing conflicts in the changelog, a line was removed by accident.

Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
2022-07-26 16:58:15 +02:00
Alejandro Colomar
6e36584a2e Supporting UNIX sockets in address matching.
This closes #645 issue on GitHub.

(Also moved a changelog line that was misplaced in a previous commit.)
2022-07-26 16:24:33 +02:00
Andrew Clayton
eebaff42ea Var: added a $dollar variable that translates to a '$'.
Allow $dollar (or ${dollar}) to translate to a literal $ to allow
support for sub-delimiters in URIs.

It is possible to have URLs like

  https://example.com/path/15$1588/9925$2976.html

and thus it would be useful to be able to specify them in various bits
of the unit config such as the location setting.

However, this hadn't been possible due to $ being used to denote
variables for substitution, e.g. $host.

As noted in the GitHub issue below, @VBart suggested using $sign to
represent a literal $.  However, I feel $dollar is more appropriate, so
that the variable is named after the thing it represents.  Also, @tippexs
found[0] that &dollar; is used in HTML to represent a $, so there is
somewhat related precedent.

(The other idea to use $$ was rejected in my original pull-request[1]
 for this issue.)

This means the above URL could be specified as

  https://example.com/path/15${dollar}1588/9925${dollar}2976.html

in the unit config.

This is done by adding a variable called 'dollar' which is loaded into
the variables hash table which translates into a literal $.

This is then handled in nxt_var_next_part() where variables are parsed
for lookup and $dollar is set for substitution by a literal '$'. Actual
variable substitution happens in nxt_var_query_finish().
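
As a rough, standalone sketch of the substitution described above (this
is not the code in nxt_var_next_part() or nxt_var_query_finish(); the
helper expand_dollar() is hypothetical and handles only this one
variable, copying everything else unchanged):

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'src' into 'dst', replacing "${dollar}" and "$dollar" with a
 * literal '$'.  All other text, including other variables, is copied
 * unchanged.  Returns the output length, or (size_t) -1 if 'dst' is
 * too small.
 */
size_t
expand_dollar(const char *src, char *dst, size_t dst_size)
{
    static const char  braced[] = "${dollar}";
    static const char  plain[] = "$dollar";
    const size_t       braced_len = sizeof(braced) - 1;
    const size_t       plain_len = sizeof(plain) - 1;
    size_t             out = 0;

    if (dst_size == 0) {
        return (size_t) -1;
    }

    while (*src != '\0') {
        char    c = *src;
        size_t  skip = 1;

        if (strncmp(src, braced, braced_len) == 0) {
            c = '$';
            skip = braced_len;

        } else if (strncmp(src, plain, plain_len) == 0
                   && !isalnum((unsigned char) src[plain_len])
                   && src[plain_len] != '_')
        {
            /* "$dollar" must not be the prefix of a longer name. */
            c = '$';
            skip = plain_len;
        }

        /* Keep one byte free for the terminating '\0'. */
        if (out + 1 >= dst_size) {
            return (size_t) -1;
        }

        dst[out++] = c;
        src += skip;
    }

    dst[out] = '\0';
    return out;
}
```

The braced form is what makes the URL example work: in "15$dollar1588"
the digits would be read as part of the variable name, while
"15${dollar}1588" is unambiguous.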

[0]: https://github.com/nginx/unit/pull/693#issuecomment-1130412323
[1]: https://github.com/nginx/unit/pull/693

Closes: https://github.com/nginx/unit/issues/675
2022-07-20 23:28:02 +01:00
Zhidao HONG
8c5e2d5ce5 HTTP: added more variables.
This commit adds the following variables:
$remote_addr, $time_local, $request_line, $status,
$body_bytes_sent, $header_referer, $header_user_agent.
2022-07-14 04:34:05 +08:00
Zhidao HONG
45b89e3257 Var: dynamic variables support.
This commit adds the variables $arg_NAME, $header_NAME, and $cookie_NAME.
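
To illustrate what "dynamic" means here, the sketch below splits a
variable name into its kind and the NAME part.  It is an assumption about
how such names can be classified, not the lookup code from this commit;
dyn_var_parse() and the enum are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum dyn_var_kind {
    DYN_VAR_NONE,
    DYN_VAR_ARG,
    DYN_VAR_HEADER,
    DYN_VAR_COOKIE
};

/*
 * Classify a variable name (without the leading '$').  On a match,
 * '*name' points at the NAME part, e.g. "user_agent" for
 * "header_user_agent"; otherwise it is set to NULL.
 */
enum dyn_var_kind
dyn_var_parse(const char *var, const char **name)
{
    static const struct {
        const char         *prefix;
        enum dyn_var_kind  kind;
    } table[] = {
        { "arg_",    DYN_VAR_ARG },
        { "header_", DYN_VAR_HEADER },
        { "cookie_", DYN_VAR_COOKIE },
    };

    size_t  i, n;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        n = strlen(table[i].prefix);

        /* Require a non-empty NAME after the prefix. */
        if (strncmp(var, table[i].prefix, n) == 0 && var[n] != '\0') {
            *name = var + n;
            return table[i].kind;
        }
    }

    *name = NULL;
    return DYN_VAR_NONE;
}
```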
2022-07-14 04:32:49 +08:00
Timo Stark
f83aef1aab Increased readtimeout for configuration endpoint.
Closes: <https://github.com/nginx/unit/issues/676>
2022-07-02 14:44:05 +02:00
Alejandro Colomar
c3e40ae932 Static: Fixed finding the file extension.
The code for finding the extension made a few assumptions that are
no longer true.  It didn't account for pathnames that didn't
contain '/', including the empty string, or the NULL string.  That
code was used with "share", which always had a '/', but now it's
also used with "index", which should not have a '/' in it.

This fix works by limiting the search to the beginning of the
string, so that if no '/' is found in it, it doesn't continue
searching before the beginning of the string.

This also happens to work for NULL.  It is technically Undefined
Behavior, as we rely on `NULL + 0 == NULL` and `NULL - NULL == 0`.
But that is the only sane behavior for an implementation, and all
existing POSIX implementations will Just Work for this code.

Relying on this UB is useful, because we don't need to add an
explicit check for NULL, and therefore we have faster code.
Although the current code can't have a NULL, I expect that when we
add support for variables in the index, it will be NULL in some
cases.

Link: <https://stackoverflow.com/q/67291052/6872717>

The same code seems to be defined behavior in C++, which normally
will share implementation in the compiler for these cases, and
therefore it is really unlikely to be in trouble.

Link: <https://stackoverflow.com/q/59409034/6872717>
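
For illustration, the bounded backward search described in this commit
can be sketched as follows.  This is not the Unit function: the name
find_extension() is hypothetical, and, unlike the commit, the sketch
checks for NULL and empty input explicitly, so it does not rely on the
pointer arithmetic discussed above.

```c
#include <stddef.h>

/*
 * Find the extension of the last path component of 'path', which is
 * 'len' bytes long and need not contain a '/'.
 * Returns a pointer just past the last '.', and stores the extension
 * length in '*ext_len'.  If there is no extension, '*ext_len' is 0.
 */
const char *
find_extension(const char *path, size_t len, size_t *ext_len)
{
    const char  *end, *p;

    *ext_len = 0;

    if (path == NULL || len == 0) {
        return path;
    }

    end = path + len;

    /* Scan backwards, but never before the beginning of the string. */
    for (p = end; p > path; p--) {
        char  c = p[-1];

        if (c == '/') {
            break;
        }

        if (c == '.') {
            *ext_len = (size_t) (end - p);
            return p;
        }
    }

    return end;
}
```

A name without '/' such as "lookatthis.htm" yields "htm", while
"dir.d/file" yields an empty extension because the search stops at
the '/'.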
2022-06-21 12:47:01 +02:00
Konstantin Pavlov
e42c52cff6 Switched changelogs to packaging alias instead of personal emails. 2022-06-20 18:21:43 +04:00
Zhidao HONG
9d2672a701 Router: forwarded header replacement. 2022-06-20 13:22:13 +08:00
Andrei Zeliankou
7e64971cbe Version bump. 2022-06-17 09:46:30 +01:00
Andrei Zeliankou
862f51bcd8 Specified date of 1.27.0 release in changes.xml. 2022-06-08 13:12:51 +01:00
Andrei Zeliankou
bd80039e07 Node.js: fixed ES modules format in loader.mjs.
Before Node.js v16.14.0, the "format" value in defaultResolve
was ignored, so the error was hidden.  For more information see:
https://github.com/nodejs/node/pull/40980
2022-06-02 11:48:27 +01:00
Artem Konev
0d5d81b271 Fixed minor issues in "changes.xml". 2022-06-01 14:54:13 +01:00
Alejandro Colomar
9bf614cd08 Var: Added $request_uri (as in NGINX).
This supports a new variable $request_uri that contains the path
and the query (See RFC 3986, section 3).  Its contents are percent
encoded.  This is useful for example to redirect HTTP to HTTPS:

{
    "return": 301,
    "location": "https://$host$request_uri"
}

When <http://example.com/foo%23bar?baz> is requested, the server
redirects to <https://example.com/foo%23bar?baz>.

===

Testing:

//diff --git a/src/nxt_http_return.c b/src/nxt_http_return.c
//index 82c9156..adeb3a1 100644
//--- a/src/nxt_http_return.c
//+++ b/src/nxt_http_return.c
//@@ -196,6 +196,7 @@ nxt_http_return_send_ready(nxt_task_t *task, void *obj, void *data)
//         field->value = ctx->encoded.start;
//         field->value_length = ctx->encoded.length;
//     }
//+    fprintf(stderr, "ALX: target[%1$i]: <%2$.*1$s>\n", (int)r->target.length, r->target.start);
//
//     r->state = &nxt_http_return_send_state;
//

{
	"listeners": {
		"*:81": {
			"pass": "routes/ru"
		}
	},

	"routes": {
		"ru": [{
			"action": {
				"return": 301,
				"location": "$request_uri"
			}
		}]
	}
}

$ curl -i http://localhost:81/*foo%2Abar?baz#arg
HTTP/1.1 301 Moved Permanently
Location: /*foo%2Abar?baz
Server: Unit/1.27.0
Date: Mon, 30 May 2022 16:04:30 GMT
Content-Length: 0

$ sudo cat /usr/local/unit.log | grep ALX
ALX: target[15]: </*foo%2Abar?baz>
2022-05-31 12:40:02 +02:00
Alejandro Colomar
9af5f36951 Static: supporting new "index" option.
This supports a new option "index" that configures a custom index
file name to be served when a directory is requested.  This
initial support only allows a single fixed string.  An example:

{
	"share": "/www/data/static/$uri",
	"index": "lookatthis.htm"
}

When <example.com/foo/bar/> is requested,
</www/data/static/foo/bar/lookatthis.htm> is served.

Default is "index.html".
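
A simplified sketch of the behavior, assuming the "share" value has
already been resolved into a root directory and a request URI (the helper
static_path() and its parameters are hypothetical, not the Unit code):

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the pathname to serve for 'uri' under 'root'.  If the URI
 * names a directory (it ends in '/'), append the configured index
 * file name, or "index.html" when 'index' is NULL (option not set).
 * Returns 0 on success, or -1 if 'out' is too small.
 */
int
static_path(const char *root, const char *uri, const char *index,
    char *out, size_t size)
{
    int     n;
    size_t  ulen = strlen(uri);

    if (index == NULL) {
        index = "index.html";
    }

    if (ulen > 0 && uri[ulen - 1] == '/') {
        n = snprintf(out, size, "%s%s%s", root, uri, index);

    } else {
        n = snprintf(out, size, "%s%s", root, uri);
    }

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```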

===

nxt_conf_validator.c:

Accept "index" as a member of "share", and make sure it's a string.

===

I tried this feature in my own computer, where I tried the
following:

- Setting "index" to "lookatthis.htm", and check that the correct
  file is being served (check both a different name and a
  different extension).
- Not setting "index", and check that <index.html> is being
  served.
- Setting "index" to an array of strings, and check that the
  configuration fails:

{
	"error": "Invalid configuration.",
	"detail": "The \"index\" value must be a string, but not an array."
}
2022-05-30 12:42:18 +02:00
Alejandro Colomar
7066acb2ce Supporting empty Location URIs.
An empty string in Location was being handled specially by not sending a
Location header.  This may occur after variable resolution, so we need to
consider this scenario.

The obsolete RFC 2616 defined the Location header as consisting of an absolute
URI <https://www.rfc-editor.org/rfc/rfc2616#section-14.30>, which cannot be an
empty string.  However, the current RFC 7231 allows the Location to be a
relative URI <https://www.rfc-editor.org/rfc/rfc7231#section-7.1.2>, and a
relative URI may be an empty string <https://stackoverflow.com/a/43338457>.

Due to these considerations, this patch allows sending an empty Location header
without handling this case specially.  This behavior will probably be more
straightforward to users, too.  It also simplifies the code, which is now more
readable, fast, and conformant to the current RFC.  We're skipping an
allocation at request time in a common case such as "action": {"return": 404}.
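
The distinction between an unset and an empty "location" can be pictured
with the following sketch.  It is only an illustration under the
assumption that "unset" is represented by a NULL pointer; the function
format_location() is hypothetical and is not how Unit builds its response
fields.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a Location header line into 'out'.
 * location == NULL means "location" is unset: nothing is written.
 * An empty string is a valid relative URI reference and is sent
 * like any other value.
 * Returns the number of bytes written, or -1 if 'out' is too small.
 */
int
format_location(const char *location, char *out, size_t size)
{
    int  n;

    if (size == 0) {
        return -1;
    }

    if (location == NULL) {
        out[0] = '\0';
        return 0;
    }

    n = snprintf(out, size, "Location: %s\r\n", location);

    if (n < 0 || (size_t) n >= size) {
        return -1;
    }

    return n;
}
```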
2022-05-16 12:57:37 +02:00
Zhidao HONG
5883a2670f Ruby: added stream IO "close" required by Rack specification.
This closes #654 issue on GitHub.
2022-05-13 19:33:40 +08:00
Zhidao HONG
0032543fa6 Ruby: added the Rack environment parameter "SCRIPT_NAME". 2022-03-09 13:29:43 +08:00
Alejandro Colomar
6fb7777ce7 Supporting variables in "location".
............
Description:
............

Before this commit, the encoded URI could be calculated at
configuration time.  Now, since variables can only be resolved at
request time, we have different situations:

- "location" contains no variables:

  In this case, we still encode the URI in the conf structure, at
  configuration time, and then we just copy the resulting string
  to the ctx structure at request time.

- "location" contains variables:

  In this case, we compile the var string at configuration time,
  then resolve it at request time, and then encode the resulting
  string.

In both cases, as was being done before, if the string is empty,
either before or after resolving variables, we skip the encoding.

...........
Usefulness:
...........

An example of why this feature may be useful is redirecting HTTP
to HTTPS with something like:

"action": {
    "return": 301,
    "location": "https://${host}${uri}"
}

.....
Bugs:
.....

This feature conflicts with the relevant RFCs in the following:

'$' is used for Unit variables, but '$' is a reserved character in
a URI, to be used as a sub-delimiter.  However, it's almost never
used as that, and in fact, other parts of Unit already conflict
with '$' being a reserved character for use as a sub-delimiter, so
this is at least consistent in that sense.  VBart suggested an
easy workaround if we ever need it: adding a variable '$sign'
which resolves to a literal '$'.

......
Notes:
......

An empty string is handled as if "location" wasn't specified at
all, so no Location header is sent.

This is incorrect, and the code is slightly misleading.

The Location header consists of a URI-reference[1], which might be
a relative one, which itself might consist of an empty string[2].

[1]: <https://www.rfc-editor.org/rfc/rfc7231#section-7.1.2>
[2]: <https://stackoverflow.com/a/43338457>

Now that we have variables, it's more likely that an empty
Location header will be requested, and we should handle it
correctly.

I think in a future commit we should modify the code to allow
differentiating between an unset "location" and an empty one,
which should be treated as any other "location" string.

.................
Testing (manual):
.................

{
  "listeners": {
    "*:80": {
      "pass": "routes/str"
    },
    "*:81": {
      "pass": "routes/empty"
    },
    "*:82": {
      "pass": "routes/var"
    },
    "*:83": {
      "pass": "routes/enc-str"
    },
    "*:84": {
      "pass": "routes/enc-var"
    }
  },
  "routes": {
    "str": [
      {
        "action": {
          "return": 301,
          "location": "foo"
        }
      }
    ],
    "empty": [
      {
        "action": {
          "return": 301,
          "location": ""
        }
      }
    ],
    "var": [
      {
        "action": {
          "return": 301,
          "location": "$host"
        }
      }
    ],
    "enc-str": [
      {
        "action": {
          "return": 301,
          "location": "f%23o#o"
        }
      }
    ],
    "enc-var": [
      {
        "action": {
          "return": 301,
          "location": "f%23o${host}#o"
        }
      }
    ]
  }
}

$ curl --dump-header - localhost:80
HTTP/1.1 301 Moved Permanently
Location: foo
Server: Unit/1.27.0
Date: Thu, 07 Apr 2022 23:30:06 GMT
Content-Length: 0

$ curl --dump-header - localhost:81
HTTP/1.1 301 Moved Permanently
Server: Unit/1.27.0
Date: Thu, 07 Apr 2022 23:30:08 GMT
Content-Length: 0

$ curl --dump-header - localhost:82
HTTP/1.1 301 Moved Permanently
Location: localhost
Server: Unit/1.27.0
Date: Thu, 07 Apr 2022 23:30:15 GMT
Content-Length: 0

$ curl --dump-header - -H "Host: bar" localhost:82
HTTP/1.1 301 Moved Permanently
Location: bar
Server: Unit/1.27.0
Date: Thu, 07 Apr 2022 23:30:23 GMT
Content-Length: 0

$ curl --dump-header - -H "Host: " localhost:82
HTTP/1.1 301 Moved Permanently
Server: Unit/1.27.0
Date: Thu, 07 Apr 2022 23:30:29 GMT
Content-Length: 0

$ curl --dump-header - localhost:83
HTTP/1.1 301 Moved Permanently
Location: f%23o#o
Server: Unit/1.27.0
Date: Sat, 09 Apr 2022 11:22:23 GMT
Content-Length: 0

$ curl --dump-header - -H "Host: " localhost:84
HTTP/1.1 301 Moved Permanently
Location: f%23o#o
Server: Unit/1.27.0
Date: Sat, 09 Apr 2022 11:22:44 GMT
Content-Length: 0

$ curl --dump-header - -H "Host: alx" localhost:84
HTTP/1.1 301 Moved Permanently
Location: f%23oalx#o
Server: Unit/1.27.0
Date: Sat, 09 Apr 2022 11:22:52 GMT
Content-Length: 0

$ curl --dump-header - -H "Host: a#l%23x" localhost:84
HTTP/1.1 301 Moved Permanently
Location: f%2523oa#l%2523x%23o
Server: Unit/1.27.0
Date: Sat, 09 Apr 2022 11:23:09 GMT
Content-Length: 0

$ curl --dump-header - -H "Host: b##ar" localhost:82
HTTP/1.1 301 Moved Permanently
Location: b#%23ar
Server: Unit/1.27.0
Date: Sat, 09 Apr 2022 11:25:01 GMT
Content-Length: 0
2022-04-28 20:40:01 +02:00
Zhidao HONG
aeed86c682 Workaround for the warning in nxt_realloc() on GCC 12.
This closes #639 issue on GitHub.
2022-02-22 19:18:18 +08:00
Zhidao HONG
4fcfb9d5fb Certificates: fixed crash when reallocating chain. 2022-02-14 20:14:03 +08:00
Max Romanov
2b5941df74 Python: fixing incorrect function object dereference.
The __call__ method can be native and not be a PyFunction type.  A type check
is thus required before accessing op_code and other fields.

Reproduced on Ubuntu 21.04, Python 3.9.4 and Falcon framework: here, the
App.__call__ method is compiled with Cython, so accessing op_code->co_flags is
invalid; accidentally, the COROUTINE bit is set which forces the Python module
into the ASGI mode.

The workaround is explicit protocol specification.

Note: it is impossible to specify the legacy mode for ASGI.
2022-02-08 12:04:41 +03:00
Max Romanov
818a78d82c Java: fixing multiple SCI initializations.
- Ignoring Tomcat WebSocket container initialization.
- Renaming application class loader to UnitClassLoader to avoid
development environment enablement in Spring Boot.

This closes #609 issue on GitHub.
2021-12-27 16:37:36 +03:00
Max Romanov
f845283820 Perl: creating input and error streams if closed.
An application handler can do anything with a stream object (including closing it).
Once the stream is closed, Unit creates a new stream.

This closes #616 issue on GitHub.
2021-12-27 16:37:35 +03:00
Valentin Bartenev
9bc314df48 Merged with the 1.26 branch. 2021-12-03 03:10:15 +03:00
Valentin Bartenev
02f24f695c Added version 1.26.1 CHANGES. 2021-12-02 18:22:57 +03:00
Valentin Bartenev
5212d60ccf Reordered changes for 1.26.1 by significance (subjective). 2021-12-02 18:22:48 +03:00
Artem Konev
6e5dcdfe84 Fixed grammar in "changes.xml". 2021-12-02 14:12:13 +00:00
Artem Konev
d3d59249e6 Fixed grammar in "changes.xml". 2021-12-02 14:12:13 +00:00
Max Romanov
c6c74d117d Disabling SCM_CREDS usage on DragonFly BSD.
DragonFly BSD supports SCM_CREDS and SCM_RIGHTS, but only the first control
message is passed correctly while the second one isn't processed by the kernel.

This closes #599 issue on GitHub.
2021-12-01 18:06:38 +03:00
Max Romanov
2d6e926a1d Disabling SCM_CREDS usage on DragonFly BSD.
DragonFly BSD supports SCM_CREDS and SCM_RIGHTS, but only the first control
message is passed correctly while the second one isn't processed by the kernel.

This closes #599 issue on GitHub.
2021-12-01 18:06:38 +03:00
Max Romanov
64db3ef1bb Fixing prototype process crash.
A prototype stores a linked list of application process structures.  When an
application process terminates, it's removed from the list.  To avoid double
removal, the pointer to the next element should be set to NULL.

The issue was introduced in c8790d2a89bb.
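
The idea of making removal safe to repeat can be sketched with a generic
circular doubly linked list.  This is not the nxt_queue code; the struct
and function names below are hypothetical.  The convention assumed here
is that a link that is not in any list has next == NULL.

```c
#include <stddef.h>

/* Hypothetical list link, for illustration only. */
struct link {
    struct link  *prev;
    struct link  *next;
};

/* An empty list is a head that points at itself. */
void
list_init(struct link *head)
{
    head->prev = head;
    head->next = head;
}

void
list_insert_tail(struct link *head, struct link *l)
{
    l->prev = head->prev;
    l->next = head;
    head->prev->next = l;
    head->prev = l;
}

/*
 * Remove 'l' from its list.  The pointers are reset to NULL so that
 * a second call becomes a no-op instead of relinking stale
 * neighbors.
 */
void
link_remove(struct link *l)
{
    if (l->next == NULL) {
        return;    /* already removed */
    }

    l->prev->next = l->next;
    l->next->prev = l->prev;

    l->prev = NULL;
    l->next = NULL;
}
```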
2021-12-01 18:05:50 +03:00
Max Romanov
97e61aad73 Fixing prototype process crash.
A prototype stores a linked list of application process structures.  When an
application process terminates, it's removed from the list.  To avoid double
removal, the pointer to the next element should be set to NULL.

The issue was introduced in c8790d2a89bb.
2021-12-01 18:05:50 +03:00
Valentin Bartenev
d4b13c7cd5 PHP: fixed crash when calling module functions in OPcache preload.
In PHP, custom fastcgi_finish_request() and overloaded chdir() functions can be
invoked by an OPcache preloading script (it runs when php_module_startup() is
called in the app process setup handler).  In this case, there was no runtime
context set so trying to access it caused a segmentation fault.

This closes #602 issue on GitHub.
2021-11-25 19:58:54 +03:00
Valentin Bartenev
f8237911d7 PHP: fixed crash when calling module functions in OPcache preload.
In PHP, custom fastcgi_finish_request() and overloaded chdir() functions can be
invoked by an OPcache preloading script (it runs when php_module_startup() is
called in the app process setup handler).  In this case, there was no runtime
context set so trying to access it caused a segmentation fault.

This closes #602 issue on GitHub.
2021-11-25 19:58:54 +03:00
Max Romanov
7ed38c9efe Added a changelog for 730e903f4534. 2021-11-25 16:58:45 +03:00
Max Romanov
42e2105282 Added a changelog for 730e903f4534. 2021-11-25 16:58:45 +03:00
Max Romanov
1c0436d644 Fixing access_log structure reference counting.
The reference to the access_log structure is stored in the current
nxt_router_conf_t and the global nxt_router_t.  When the reference is copied,
the reference counter should be adjusted accordingly.
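
As a generic sketch of the rule (the type and functions below are
hypothetical and much simpler than the real access_log structure): every
copy of the pointer takes a reference, and the object is freed only when
the last holder releases it.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the shared structure. */
typedef struct {
    unsigned  refs;
} access_log_t;

/* Create the object with one reference, owned by the caller. */
access_log_t *
access_log_create(void)
{
    access_log_t  *log = malloc(sizeof(*log));

    if (log != NULL) {
        log->refs = 1;
    }

    return log;
}

/* Call whenever the pointer is copied into another owner. */
access_log_t *
access_log_retain(access_log_t *log)
{
    log->refs++;
    return log;
}

/* Drop one reference.  Returns 1 if the object was freed. */
int
access_log_release(access_log_t *log)
{
    if (--log->refs == 0) {
        free(log);
        return 1;
    }

    return 0;
}
```

If the copy is made without the retain step, the first release frees the
object while the other holder still points at it.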

This closes #593 issue on GitHub.
2021-11-25 16:58:43 +03:00
Max Romanov
0af5f6ddb4 Fixing access_log structure reference counting.
The reference to the access_log structure is stored in the current
nxt_router_conf_t and the global nxt_router_t.  When the reference is copied,
the reference counter should be adjusted accordingly.

This closes #593 issue on GitHub.
2021-11-25 16:58:43 +03:00
Valentin Bartenev
f30f8f06c9 Version bump. 2021-12-02 17:16:05 +03:00
Valentin Bartenev
015610f12d Version bump. 2021-11-22 07:23:07 +03:00
Valentin Bartenev
0eaeb65edb Added version 1.26.0 CHANGES. 2021-11-18 15:48:02 +03:00
Valentin Bartenev
9b1dcc4aa6 Reordered changes for 1.26.0 by significance (subjective). 2021-11-18 15:48:02 +03:00