Subject: Re: [BUGS] Error in backend/storage/lmgr/proc.c: ProcSleep() From: Tom Lane To: Tomasz Zielonka cc: PostgreSQL bugs Date: Mon, 03 Sep 2001 22:37:11 -0400 Comments: In-reply-to Tomasz Zielonka message dated "Mon, 03 Sep 2001 06:55:57 +0200" Tomasz Zielonka writes: > Platform: PostgreSQL 7.1.3, Linux 2.4.8, egcs 2.91.66 > PostgreSQL forgets to release lock until shutdown in this scenario: Good catch! This has been broken since 7.1 ... surprising that no one discovered the problem sooner. I think that rather than removing the early-deadlock-detection code as you suggest, it's better to make it work correctly. I have applied a patch that allows RemoveFromWaitQueue() to be used, so that the recovery path is the same as if HandleDeadLock had been invoked. regards, tom lane *** /home/postgres/pgsql/src/backend/storage/lmgr/proc.c.orig Fri Jul 6 17:04:26 2001 --- /home/postgres/pgsql/src/backend/storage/lmgr/proc.c Mon Sep 3 22:26:57 2001 *************** *** 506,521 **** SPINLOCK spinlock = lockctl->masterLock; PROC_QUEUE *waitQueue = &(lock->waitProcs); int myHeldLocks = MyProc->heldLocks; PROC *proc; int i; - #ifndef __BEOS__ struct itimerval timeval, dummy; - #else bigtime_t time_interval; - #endif /* --- 506,519 ---- SPINLOCK spinlock = lockctl->masterLock; PROC_QUEUE *waitQueue = &(lock->waitProcs); int myHeldLocks = MyProc->heldLocks; + bool early_deadlock = false; PROC *proc; int i; #ifndef __BEOS__ struct itimerval timeval, dummy; #else bigtime_t time_interval; #endif /* *************** *** 535,541 **** * immediately. This is the same as the test for immediate grant in * LockAcquire, except we are only considering the part of the wait * queue before my insertion point. - * */ if (myHeldLocks != 0) { --- 533,538 ---- *************** *** 550,558 **** /* Must I wait for him ? */ if (lockctl->conflictTab[lockmode] & proc->heldLocks) { ! /* Yes, can report deadlock failure immediately */ ! MyProc->errType = STATUS_ERROR; ! return STATUS_ERROR; } /* I must go before this waiter. Check special case. */ if ((lockctl->conflictTab[lockmode] & aheadRequests) == 0 && --- 547,560 ---- /* Must I wait for him ? */ if (lockctl->conflictTab[lockmode] & proc->heldLocks) { ! /* ! * Yes, so we have a deadlock. Easiest way to clean up ! * correctly is to call RemoveFromWaitQueue(), but we ! * can't do that until we are *on* the wait queue. ! * So, set a flag to check below, and break out of loop. ! */ ! early_deadlock = true; ! break; } /* I must go before this waiter. Check special case. */ if ((lockctl->conflictTab[lockmode] & aheadRequests) == 0 && *************** *** 600,606 **** MyProc->waitHolder = holder; MyProc->waitLockMode = lockmode; ! MyProc->errType = STATUS_OK;/* initialize result for success */ /* mark that we are waiting for a lock */ waitingForLock = true; --- 602,620 ---- MyProc->waitHolder = holder; MyProc->waitLockMode = lockmode; ! MyProc->errType = STATUS_OK; /* initialize result for success */ ! ! /* ! * If we detected deadlock, give up without waiting. This must agree ! * with HandleDeadLock's recovery code, except that we shouldn't release ! * the semaphore since we haven't tried to lock it yet. ! */ ! if (early_deadlock) ! { ! RemoveFromWaitQueue(MyProc); ! MyProc->errType = STATUS_ERROR; ! return STATUS_ERROR; ! } /* mark that we are waiting for a lock */ waitingForLock = true;