I needed to reload my network interface on the machine where my PostgreSQL database is hosted and, after doing that, my thread got forever blocked on the getNotifications(...)
call.
"Loritta PostgreSQL Notification Listener" #261 daemon prio=5 os_prio=0 cpu=48.89ms elapsed=62372.91s tid=0x00007f45f806a460 nid=0xf08b5 runnable [0x00007f45d6dfc000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.Net.poll([email protected]/Native Method)
at sun.nio.ch.NioSocketImpl.park([email protected]/NioSocketImpl.java:181)
at sun.nio.ch.NioSocketImpl.timedRead([email protected]/NioSocketImpl.java:285)
at sun.nio.ch.NioSocketImpl.implRead([email protected]/NioSocketImpl.java:309)
at sun.nio.ch.NioSocketImpl.read([email protected]/NioSocketImpl.java:350)
at sun.nio.ch.NioSocketImpl$1.read([email protected]/NioSocketImpl.java:803)
at java.net.Socket$SocketInputStream.read([email protected]/Socket.java:966)
at sun.security.ssl.SSLSocketInputRecord.read([email protected]/SSLSocketInputRecord.java:478)
at sun.security.ssl.SSLSocketInputRecord.readHeader([email protected]/SSLSocketInputRecord.java:472)
at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket([email protected]/SSLSocketInputRecord.java:70)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord([email protected]/SSLSocketImpl.java:1455)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:1059)
at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
at org.postgresql.core.PGStream.receiveChar(PGStream.java:453)
at org.postgresql.core.v3.QueryExecutorImpl.processNotifies(QueryExecutorImpl.java:789)
- locked <0x0000000621293410> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc.PgConnection.getNotifications(PgConnection.java:1107)
at net.perfectdreams.loritta.cinnamon.pudding.utils.PostgreSQLNotificationListener.run(PostgreSQLNotificationListener.kt:29)
at java.lang.Thread.run([email protected]/Thread.java:833)
So I went out and tried figuring out if this could be reproduced, and found out that someone had already reported this issue, but it was closed due to "lack of feedback". Anyhow, here's my investigations and explanations to anyone else wondering why their PostgreSQL getNotifications
call is not receiving new notifications, even tho it is blocked on the getNotifications
call!
To debug this, I ran a PostgreSQL container using Docker, so we can abuse the docker pause
feature to simulate a network failure. (because blocking network connections on Windows is hard)
docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres
fun main() {
// 172.24.48.1 = WSL2 IP
val url = "jdbc:postgresql://172.24.61.21:55432/postgres?user=postgres&password=mysecretpassword"
val connection = DriverManager.getConnection(url) as PgConnection
while (true) {
// I know that we aren't listening for any channels, this doesn't matter
println("Blocked waiting for notifications")
connection.getNotifications(0)
}
}
If we shutdown PostgreSQL with docker stop some-postgres
...
Blocked waiting for notifications
Exception in thread "main" org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675)
at org.postgresql.core.v3.QueryExecutorImpl.processNotifies(QueryExecutorImpl.java:800)
at org.postgresql.jdbc.PgConnection.getNotifications(PgConnection.java:1107)
at net.perfectdreams.loritta.cinnamon.discord.webserver.TestKt.main(Test.kt:16)
at net.perfectdreams.loritta.cinnamon.discord.webserver.TestKt.main(Test.kt)
Okay, so, what if we docker pause some-postgres
?
Blocked waiting for notifications
Hmmm... so the thread keeps blocking on the getNotifications(0)
call, interesting...
Let's unpause the container with docker unpause some-postgres
and add a notification timeout to the getNotifications(...)
call
fun main() {
// 172.24.48.1 = WSL2 IP
val url = "jdbc:postgresql://172.24.61.21:55432/postgres?user=postgres&password=mysecretpassword"
val connection = DriverManager.getConnection(url) as PgConnection
while (true) {
// I know that we aren't listening for any channels, this doesn't matter
println("Blocked waiting for notifications")
connection.getNotifications(5_000)
}
}
Let's run it, pause the container with docker pause some-postgres
...
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Blocked waiting for notifications
Hmmm, okay, so it keeps looping the while (true)
check, even tho the PostgreSQL database is not active, so now what?
@ioannisioannou16 recommended using the isValid
method to check if the connection is open AND adding a timeout to the getNotifications
call, so let's do it!
fun main() {
// 172.24.48.1 = WSL2 IP
val url = "jdbc:postgresql://172.24.61.21:55432/postgres?user=postgres&password=mysecretpassword"
val connection = DriverManager.getConnection(url) as PgConnection
while (true) {
println("Checking if the connection is valid...")
val isValid = connection.isValid(5) // This is in seconds!!
println("Connection is valid? $isValid")
if (!isValid)
break
// I know that we aren't listening for any channels, this doesn't matter
println("Blocked waiting for notifications")
val notifications = connection.getNotifications(5_000)
println("Notifications: $notifications")
}
}
And now it works: If the container is paused while the getNotifications(...)
call is active, it will unblock after a while and then run the loop again, where then the isValid
call will wait 5s until the connection is considered "valid" and, if it isn't, the loop will exit.
Checking if the connection is valid...
Connection is valid? true
Blocked waiting for notifications
Notifications: [Lorg.postgresql.PGNotification;@77846d2c
Checking if the connection is valid...
Connection is valid? true
Blocked waiting for notifications
Notifications: [Lorg.postgresql.PGNotification;@548ad73b
Checking if the connection is valid...
Connection is valid? false
And now you may be wondering: Why getNotifications(...)
doesn't fail if a network issue happens? If we are having a network issue, we don't receive a notification from PostgreSQL saying "hey i'm shutting down ok pls disconnect", in fact, the connection receives ✨ nothing at all ✨!
And notifications are received asynchronously, in fact, the JDBC driver doesn't send any queries to the PostgreSQL database indicating that "hey, pls send new notifications to me!!" when calling getNotifications(...)
, it just blocks until any data is received from the database.
So that's where the issue lies:
getNotifications(...)
doesn't send any data through the connection.So a quick and dirty fix is doing what @ioannisioannou16 suggested: Add a timeout to getNotifications(...)
, call isValid(...)
with a timeout too, if isValid(...)
returns false, then it means that the connection is dead.
The disadvantage of this solution is that the thread needs to be unblocked every once in a while to check if the connection is still alive.
Maybe the jdbc docs should be updated to mention this issue? 😊 https://jdbc.postgresql.org/documentation/head/listennotify.html