I am trying to do batch detection using Darknet/YOLOv4. It works for one batch, then the second batch fails with a CUDA error. Am I missing something in the snippet below? Also, what are the right batch parameters for an RTX GPU card, and how do I determine the right batch size?
My system configuration is as follows:
System: Host: ubox Kernel: 5.4.0-42-generic x86_64 bits: 64
Desktop: Gnome 3.28.4 Distro: Ubuntu 18.04.4 LTS
Machine: Device: desktop System: Alienware product: Alienware Aurora R9 v: 1.0.7 serial: N/A
Mobo: Alienware model: 0T76PD v: A01 serial: N/A
UEFI: Alienware v: 1.0.7 date: 12/23/2019
CPU: 8 core Intel Core i7-9700K (-MCP-) cache: 12288 KB
clock speeds: max: 4900 MHz 1: 800 MHz 2: 800 MHz 3: 800 MHz
4: 800 MHz 5: 801 MHz 6: 803 MHz 7: 808 MHz 8: 810 MHz
Graphics: Card-1: Intel Device 3e98
Card-2: NVIDIA Device 1e84
Display Server: x11 (X.Org 1.20.8 )
drivers: modesetting,nvidia (unloaded: fbdev,vesa,nouveau)
Resolution: 2560x1440@59.95hz, 1920x1080@60.00hz
OpenGL: renderer: GeForce RTX 2070 SUPER/PCIe/SSE2
version: 4.6.0 NVIDIA 450.57
I am getting CUDA Error: out of memory when I call performBatchDetectV2() with a batch size of 3.
How do I do batching properly on the YOLOv4 architecture? In my use case I get frames from a camera and want to batch 10 frames together and call the function below. The function works perfectly if I call it just once; it throws the CUDA error on the second batch of frames.
import cv2
import numpy as np
from ctypes import POINTER, c_float

# load_net_custom, load_meta, network_width, network_height, network_predict_batch,
# do_nms_obj, free_batch_detections, and IMAGE come from Darknet's Python wrapper (darknet.py)

def performBatchDetectV2(image_list, thresh=0.25, configPath="./cfg/yolov4.cfg", weightPath="yolov4.weights", metaPath="./cfg/coco.data", hier_thresh=.5, nms=.45, batch_size=3):
    net = load_net_custom(configPath.encode('utf-8'), weightPath.encode('utf-8'), 0, batch_size)
    meta = load_meta(metaPath.encode('utf-8'))
    pred_height, pred_width, c = image_list[0].shape
    net_width, net_height = (network_width(net), network_height(net))
    img_list = []
    for custom_image_bgr in image_list:
        custom_image = cv2.cvtColor(custom_image_bgr, cv2.COLOR_BGR2RGB)
        custom_image = cv2.resize(
            custom_image, (net_width, net_height), interpolation=cv2.INTER_NEAREST)
        custom_image = custom_image.transpose(2, 0, 1)
        img_list.append(custom_image)
    arr = np.concatenate(img_list, axis=0)
    arr = np.ascontiguousarray(arr.flat, dtype=np.float32) / 255.0
    data = arr.ctypes.data_as(POINTER(c_float))
    im = IMAGE(net_width, net_height, c, data)
    batch_dets = network_predict_batch(net, im, batch_size, pred_width,
                                       pred_height, thresh, hier_thresh, None, 0, 0)
    batch_boxes = []
    batch_scores = []
    batch_classes = []
    for b in range(batch_size):
        num = batch_dets[b].num
        dets = batch_dets[b].dets
        if nms:
            do_nms_obj(dets, num, meta.classes, nms)
        boxes = []
        scores = []
        classes = []
        for i in range(num):
            det = dets[i]
            score = -1
            label = None
            for c in range(det.classes):
                p = det.prob[c]
                if p > score:
                    score = p
                    label = c
            if score > thresh:
                box = det.bbox
                left, top, right, bottom = map(int, (box.x - box.w / 2, box.y - box.h / 2,
                                                     box.x + box.w / 2, box.y + box.h / 2))
                boxes.append((top, left, bottom, right))
                scores.append(score)
                classes.append(label)
                # boxColor = (int(255 * (1 - (score ** 2))), int(255 * (score ** 2)), 0)
                # cv2.rectangle(image_list[b], (left, top),
                #               (right, bottom), boxColor, 2)
                # cv2.imwrite(os.path.basename(img_samples[b]), image_list[b])
        batch_boxes.append(boxes)
        batch_scores.append(scores)
        batch_classes.append(classes)
    free_batch_detections(batch_dets, batch_size)
    return batch_boxes, batch_scores, batch_classes
The problem is that the network is loaded every time you call the performBatchDetectV2 method. You have to load the network once, with a constant batch size, and reuse the loaded network for prediction.
Declare your network variables as globals so they are loaded only once.
This is your function:
def performBatchDetectV2(image_list, thresh=0.25, configPath="./cfg/yolov4.cfg", weightPath="yolov4.weights", metaPath="./cfg/coco.data", hier_thresh=.5, nms=.45, batch_size=3):
    net = load_net_custom(configPath.encode('utf-8'), weightPath.encode('utf-8'), 0, batch_size)
    meta = load_meta(metaPath.encode('utf-8'))
    <your code>
Change it to this:
configPath = "./cfg/yolov4.cfg"
weightPath = "yolov4.weights"
metaPath = "./cfg/coco.data"
batch_size = 3
net = load_net_custom(configPath.encode('utf-8'), weightPath.encode('utf-8'), 0, batch_size)
meta = load_meta(metaPath.encode('utf-8'))

def performBatchDetectV2(image_list, thresh=0.25, hier_thresh=.5, nms=.45, batch_size=3):
    global net, meta
    <your code>
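With net and meta loaded once at module level, repeated calls reuse the same network and GPU memory instead of reallocating it. A minimal usage sketch (the frame file names below are placeholders standing in for your camera feed):

# hypothetical frames standing in for your camera feed; the number of frames
# per call must match the batch_size the network was loaded with (3 here)
camera_frames = [cv2.imread(p) for p in ["f1.jpg", "f2.jpg", "f3.jpg",
                                         "f4.jpg", "f5.jpg", "f6.jpg"]]
for start in range(0, len(camera_frames), batch_size):
    batch = camera_frames[start:start + batch_size]
    if len(batch) < batch_size:
        break  # or pad the last batch up to batch_size
    batch_boxes, batch_scores, batch_classes = performBatchDetectV2(batch)
    print(batch_boxes, batch_scores, batch_classes)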
I am looking at some Fortran code from an old scanned paper. The scan quality is not great, so I may have copied it incorrectly. I tried to run it with an online Fortran compiler, but it fails. Not being familiar with Fortran, I was wondering if someone can point out where the syntax does not make sense. The code is from a paper on sediment dynamics:
Komar, P.D. and Miller, M.C., 1975. On the comparison between the threshold of sediment motion under waves and unidirectional currents with a discussion of the practical evaluation of the threshold: Reply. Journal of Sedimentary Research, 45(1).
PROGRAM TSHOLD
REAL LI, LO
G = 981.0
PIE = 3.1416
RHOW = 1.00
READ (6O,1) DIAM, RHOS
1 FORMAT (2X, F6.3,2X, F5.3)
IF(DIAM .LT. 0.05) GO TO 5
A = 0.463 * PIE
B = 0.25
GO TO 7
5 A = 0.21
B = 0.50
7 PWR = 1.0 / (2.0 - B)
FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
FAC1 = FAC * DIAM**((1.0 - B) * PWR)
T = 1.0
15 J = 1.20
LD = 156.13 * (T**2)
UM = FAC1 * T**(B*PWR)
WRITE(61,9) DIAM, T, UM
9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = ,F6.3,1X,2HCM //
1 11X, 14HWAVE PERIOD = ,F5.2, 1X, 3HSEC //
2 11X, 22HORBITAL VELOCITY, UM = ,F6.2, 1X, 6HCM/SECl //
3 20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH //
4 22X, 2HCM, 8X, 2HCM /)
C INCREMENT WAVE HEIGHT, CALCULATE DEPTH
H = 10.0
DO 12 K = 1.60
SING = PIE * H / (UM * T)
X = SING
IF(X.LT.1.0) GO TO 30
30 ASINH = X - 0.16666*X**3.0 + 0.07500* X ** 5.0 - 0.04464 * X ** 7.0
1 + 0.03038 * X ** 9.0 - 0.02237 * X ** 11.0
32 LI = LD * (SINH(ASINH)/COSH(ASINH))
OPTH = ASINH * LI / 6.2832
C CHECK WAVE STABILITY
RATIO = H / DPTH
IF(RATIO.GE.0.78) GO TO 11
STEEP = H / LI
TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
IF(STEEP.GE.TEST) GO TO 11
WRITE(61,10) H, OPTH, STEEP, RATIO
I0 FORMAT(IH0, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
11 H = H + 10.0
12 CONTINUE
T = T + 1.0
15 CONTINUE
END
The problem is more likely that old Fortran requires fixed-form source formatting, where the number of spaces before a statement is significant.
Here are some general rules:
Normal statements start at column 7 and beyond.
Lines cannot exceed 72 columns.
Any character placed in column 6 indicates the line is a continuation of the line above. I see that in the code above, in the lines following 9 FORMAT(...
A number placed in columns 1-5 is a label, which can be the target of a GO TO or DO statement, or mark a FORMAT specification.
The character C in the first column (and in some dialects any character in the first column) marks a comment line.
See https://people.cs.vt.edu/~asandu/Courses/MTU/CS2911/fortran_notes/node4.html for more info.
Based on the rules above, here is how to enter the code with the correct spacing. I ran the F77 code through a converter to make it compatible with both F90 and F77. The code below might compile with the online compiler now.
      PROGRAM TSHOLD
      REAL LI, LD
      G = 981.0
      PIE = 3.1416
      RHOW = 1.00
      READ (60,1) DIAM, RHOS
    1 FORMAT (2X, F6.3, 2X, F5.3)
      IF(DIAM .LT. 0.05) GO TO 5
      A = 0.463 * PIE
      B = 0.25
      GO TO 7
    5 A = 0.21
      B = 0.50
    7 PWR = 1.0 / (2.0 - B)
      FAC = (A * (RHOS - RHOW) * G/(RHOW * PIE**B))**PWR
      FAC1 = FAC * DIAM**((1.0 - B) * PWR)
      T = 1.0
      DO 15 J = 1, 20
      LD = 156.13 * (T**2)
      UM = FAC1 * T**(B*PWR)
      WRITE(61,9) DIAM, T, UM
    9 FORMAT(1H0, 10X, 17HGRAIN DIAMETER = , F6.3, 1X, 2HCM // &
     &       11X, 14HWAVE PERIOD = , F5.2, 1X, 3HSEC // &
     &       11X, 22HORBITAL VELOCITY, UM = , F6.2, 1X, 6HCM/SEC // &
     &       20X, 6HHEIGHT, 5X, 5HDEPTH, 8X, 3HH/L, 6X, 7HH/DEPTH // &
     &       22X, 2HCM, 8X, 2HCM /)
! INCREMENT WAVE HEIGHT, CALCULATE DEPTH
      H = 10.0
      DO 12 K = 1, 60
      SING = PIE * H / (UM * T)
      X = SING
      IF(X.LT.1.0) GO TO 30
   30 ASINH = X - 0.16666*X**3.0 + 0.07500*X**5.0 - 0.04464*X**7.0 &
     &        + 0.03038*X**9.0 - 0.02237*X**11.0
   32 LI = LD * (SINH(ASINH)/COSH(ASINH))
      DPTH = ASINH * LI / 6.2832
! CHECK WAVE STABILITY
      RATIO = H / DPTH
      IF(RATIO.GE.0.78) GO TO 11
      STEEP = H / LI
      TEST = 0.142 * (SINH(ASINH)/COSH(ASINH))
      IF(STEEP.GE.TEST) GO TO 11
      WRITE(61,10) H, DPTH, STEEP, RATIO
   10 FORMAT(1H0, 20X, F5.1, 4X, E9.3, 4X, F5.3, 4X, F4.2)
   11 H = H + 10.0
   12 CONTINUE
      T = T + 1.0
   15 CONTINUE
      END
I found several transcription errors: dots in place of commas, the letters O and I in place of the digits 0 and 1 (including LO for the variable LD, and OPTH for DPTH), and a missing DO statement.
I am using OpenCV's findContours to find the points describing an image made up of lines (not polygons), like this:
cv::findContours(src, contours, hierarchy, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
If I understand correctly, the cv2.connectedComponents method gives what you are looking for. It assigns a label to each point in your image, and the label is the same for connected points. With this assignment there is no duplication. So, if your lines are one pixel wide (e.g. the output of an edge detector or a thinning operator), you get one point per location.
Edit:
As per the OP's request, lines should be one pixel wide. To achieve this, a thinning operation is applied before finding connected components. Step images have been added too.
Please note that each connected component's points are sorted in ascending order of their y coordinates.
import random

import cv2
import numpy as np

img_path = "D:/_temp/fig.png"
output_dir = 'D:/_temp/'
img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
_, img = cv2.threshold(img, 128, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)
total_white_pixels = cv2.countNonZero(img)
print("Total White Pixels Before Thinning = ", total_white_pixels)
cv2.imwrite(output_dir + '1-thresholded.png', img)
# apply thinning -> each line is one pixel wide
img = cv2.ximgproc.thinning(img)
cv2.imwrite(output_dir + '2-thinned.png', img)
total_white_pixels = cv2.countNonZero(img)
print("Total White Pixels After Thinning = ", total_white_pixels)
no_ccs, labels = cv2.connectedComponents(img)
label_pnts_dic = {}
colored = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
i = 1  # skip label 0, which corresponds to the background points
sum_of_cc_points = 0
while i < no_ccs:
    label_pnts_dic[i] = np.where(labels == i)  # np.where returns a tuple (row/y coords, col/x coords)
    colored[label_pnts_dic[i]] = (random.randint(100, 255), random.randint(100, 255), random.randint(100, 255))
    i += 1
cv2.imwrite(output_dir + '3-colored.png', colored)
print("First ten points of label-1 cc: ")
for i in range(10):
    print("x: ", label_pnts_dic[1][1][i], "y: ", label_pnts_dic[1][0][i])
Output:
Total White Pixels Before Thinning = 6814
Total White Pixels After Thinning = 2065
First ten points of label-1 cc:
x: 312 y: 104
x: 313 y: 104
x: 314 y: 104
x: 315 y: 104
x: 316 y: 104
x: 317 y: 104
x: 318 y: 104
x: 319 y: 104
x: 320 y: 104
x: 321 y: 104
Images:
1. Thresholded
2. Thinned
3. Colored components
Edit 2:
After a discussion with the OP, I understood that having a list of (scattered) points is not enough. The points should be ordered so that they can be traced. To achieve that, new logic has to be introduced after thinning the image:
1. Find extreme points (points with a single 8-connectivity neighbor).
2. Find connector points (points with 3-way connectivity).
3. Find simple points (all other points).
4. Start tracing from an extreme point until reaching another extreme point or a connector one.
5. Extract the traveled path.
6. Check whether a connector point has turned into a simple point and update its status.
7. Repeat.
8. Check if there are any closed loops of simple points that were not reached from any extreme point, and extract each closed loop as an additional path.
Code for the extreme/connector/simple point classification (a sketch of the tracing step follows the code below):
def filter_neighbors(ns):
    i = 0
    while i < len(ns):
        j = i + 1
        while j < len(ns):
            if (ns[i][0] == ns[j][0] and abs(ns[i][1] - ns[j][1]) <= 1) or (ns[i][1] == ns[j][1] and abs(ns[i][0] - ns[j][0]) <= 1):
                del ns[j]
                break
            j += 1
        i += 1

def sort_points_types(pnts):
    extremes = []
    connections = []
    simple = []
    for i in range(pnts.shape[0]):
        neighbors = []
        for j in range(pnts.shape[0]):
            if i == j:
                continue
            if abs(pnts[i, 0] - pnts[j, 0]) <= 1 and abs(pnts[i, 1] - pnts[j, 1]) <= 1:  # 8-connectivity check
                neighbors.append(pnts[j])
        filter_neighbors(neighbors)
        if len(neighbors) == 1:
            extremes.append(pnts[i])
        elif len(neighbors) == 2:
            simple.append(pnts[i])
        elif len(neighbors) > 2:
            connections.append(pnts[i])
    return extremes, connections, simple
img_path = "D:/_temp/fig.png"
output_dir = 'D:/_temp/'
img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
_, img = cv2.threshold(img, 128, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)
img = cv2.ximgproc.thinning(img)
pnts = cv2.findNonZero(img)
pnts = np.squeeze(pnts)
ext, conn, simple = sort_points_types(pnts)
for p in conn:
    cv2.circle(img, (p[0], p[1]), 5, 128)
for p in ext:
    cv2.circle(img, (p[0], p[1]), 5, 128)
cv2.imwrite(output_dir + "6-both.png", img)
print(len(ext), len(conn), len(simple))
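For completeness, here is a rough sketch of the tracing step from the list above. It is a hypothetical implementation, not the code from the discussion: it walks from each extreme point through 8-connected neighbors until it hits another extreme or a connector point, consuming simple points along the way (closed loops and duplicate reverse traces are left unhandled). It assumes ext, conn, and simple come from sort_points_types above.

def trace_paths(ext, conn, simple):
    # stop points end a traced path: extremes and connectors
    stops = {tuple(p) for p in ext} | {tuple(p) for p in conn}
    free = {tuple(p) for p in simple}  # simple points, consumed while tracing
    paths = []
    for start in (tuple(p) for p in ext):
        path = [start]
        prev, cur = None, start
        while True:
            nxt = None
            for dx in (-1, 0, 1):  # scan the 8-neighborhood of cur
                for dy in (-1, 0, 1):
                    cand = (cur[0] + dx, cur[1] + dy)
                    if cand != cur and cand != prev and (cand in free or cand in stops):
                        nxt = cand
                        break
                if nxt is not None:
                    break
            if nxt is None:  # dead end: isolated point or already-consumed branch
                break
            path.append(nxt)
            if nxt in stops:  # reached another extreme or a connector
                break
            free.discard(nxt)
            prev, cur = cur, nxt
        if len(path) > 1:
            paths.append(path)
    return paths

# e.g. traced = trace_paths(ext, conn, simple)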
Edit 3:
A much more efficient implementation classifies the points in a single pass by checking neighbors in a kernel-like way, thanks to eldesgraciado!
Note: before calling this method, the image should be padded with one pixel on each side to avoid border checks, or equivalently the border pixels should be blacked out.
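For example, the padding could be done with cv2.copyMakeBorder (a one-line sketch; note that point coordinates found afterwards refer to the padded image):

# add a 1-pixel black border so the y±1 / x±1 lookups below never go out of bounds
img = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_CONSTANT, value=0)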
def sort_points_types(pnts, img):
    extremes = []
    connections = []
    simple = []
    for p in pnts:
        x = p[0]
        y = p[1]
        n = []
        if img[y - 1, x] > 0: n.append((y - 1, x))
        if img[y - 1, x - 1] > 0: n.append((y - 1, x - 1))
        if img[y - 1, x + 1] > 0: n.append((y - 1, x + 1))
        if img[y, x - 1] > 0: n.append((y, x - 1))
        if img[y, x + 1] > 0: n.append((y, x + 1))
        if img[y + 1, x] > 0: n.append((y + 1, x))
        if img[y + 1, x - 1] > 0: n.append((y + 1, x - 1))
        if img[y + 1, x + 1] > 0: n.append((y + 1, x + 1))
        filter_neighbors(n)
        if len(n) == 1:
            extremes.append(p)
        elif len(n) == 2:
            simple.append(p)
        elif len(n) > 2:
            connections.append(p)
    return extremes, connections, simple
An image visualizing the extreme and connector points.
I have two different ratings that a user can give a teacher, and I want to convert each rating into pixels so I can produce a progress-bar effect, for example:
maximum_pixels = 100  # maximum width
services = 4.5  # width: 95px
professionalism = 5.0  # width: 100px
total_percentage = maximum_pixels * services / maximum_pixels
How can I implement that in my code?
maxAllowed = 100
minAllowed = 0
unscaledNum = 3
_min = 0
_max = 5
((maxAllowed - minAllowed) * (unscaledNum - _min) / (_max - _min) + minAllowed)
Result:
60.0
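Wrapped in a small helper (the function name scale_value is mine, not from any library), the same linear mapping can be reused for both ratings:

def scale_value(value, src_min, src_max, dst_min, dst_max):
    # linearly map value from [src_min, src_max] onto [dst_min, dst_max]
    return (dst_max - dst_min) * (value - src_min) / (src_max - src_min) + dst_min

print(scale_value(3, 0, 5, 0, 100))    # 60.0, matching the result above
print(scale_value(5.0, 0, 5, 0, 100))  # 100.0 -> full-width professionalism bar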
This is the code we had in Swift 2. What is the Swift 3 version? I don't see a replacement for setSharedURLCache.
let sharedCache: NSURLCache = NSURLCache(memoryCapacity: 0, diskCapacity: 0, diskPath: nil)
NSURLCache.setSharedURLCache(sharedCache)
This works in Xcode 8 Beta 4
URLCache.shared = sharedCache
Here is an example in Swift 3, increasing the cache size to 500 MB:
let memoryCapacity = 500 * 1024 * 1024
let diskCapacity = 500 * 1024 * 1024
let cache = URLCache(memoryCapacity: memoryCapacity, diskCapacity: diskCapacity, diskPath: "myDataPath")
URLCache.shared = cache
This works in Xcode 8:
URLCache.shared = {
URLCache(memoryCapacity: 0, diskCapacity: 0, diskPath: nil)
}()
I need to programmatically get the refresh rate of a monitor.
When I type xrandr (1.4.1, openSUSE 13) on the command line, I get:
Screen 0: minimum 8 x 8, current 1920 x 1200, maximum 16384 x 16384
VGA-0 disconnected primary (normal left inverted right x axis y axis)
DVI-D-0 connected 1920x1200+0+0 (normal left inverted right x axis y axis) 518mm x 324mm
1920x1200 60.0*+
1920x1080 60.0
1680x1050 60.0
1600x1200 60.0
1280x1024 60.0
1280x960 60.0
1024x768 60.0
800x600 60.3
640x480 59.9
HDMI-0 disconnected (normal left inverted right x axis y axis)
This result is confirmed by nvidia-settings -q RefreshRate, among other things.
But ...
when I run the following code (origin: https://github.com/raboof/xrandr/blob/master/xrandr.c), compiled with g++ 4.8.1 (with -lX11 -lXext -lXrandr):
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>

int main(void)
{
    int nsize;
    int nrate;
    short *rates;
    XRRScreenSize *sizes;
    Display *dpy = XOpenDisplay(NULL);
    Window root = DefaultRootWindow(dpy);
    XRRScreenConfiguration *conf = XRRGetScreenInfo(dpy, root);
    printf("Current rate: %d\n", XRRConfigCurrentRate(conf));
    sizes = XRRConfigSizes(conf, &nsize);
    printf(" SZ: Pixels Refresh\n");
    for (int i = 0; i < nsize; i++) {
        printf("%-2d %5d x %-5d", i, sizes[i].width, sizes[i].height);
        rates = XRRConfigRates(conf, i, &nrate);
        if (nrate)
            printf(" ");
        for (int j = 0; j < nrate; j++)
            printf("%-4d", rates[j]);
        printf("\n");
    }
    XRRFreeScreenConfigInfo(conf);
    return 0;
}
I get:
Current rate: 50
SZ: Pixels Refresh
0 1920 x 1200 50
1 1920 x 1080 51
2 1680 x 1050 52
3 1600 x 1200 53
4 1280 x 1024 54
5 1280 x 960 55
6 1024 x 768 56
7 800 x 600 57
8 640 x 480 58
9 1440 x 900 59
10 1366 x 768 60
11 1280 x 800 61
12 1280 x 720 62
Why am I getting this result?
What am I doing wrong?
The software uses OpenGL with GLEW. Can this have any influence?
We do call glXQueryDrawable(dpy, drawable, GLX_SWAP_INTERVAL_EXT, &val), but only afterwards, and I do not think this should have any influence.
I found the answer:
If the XRandR server supports version 1.2 of the protocol, then the corresponding 1.2 functions need to be used (which I plan to do by copying snippets of code from https://github.com/raboof/xrandr/blob/master/xrandr.c where has_1_2 is true).
My code in the question uses functions for version 1.1 of the protocol, and therefore only the metamodes are returned.
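As a rough cross-check (this parses the CLI output rather than using the RandR 1.2 C API the fix above refers to), the current rate can also be scraped from xrandr in a few lines of Python:

import re
import subprocess

# parse `xrandr` output: the current mode's rate is the one marked with '*'
out = subprocess.run(["xrandr"], capture_output=True, text=True).stdout
for line in out.splitlines():
    m = re.match(r"\s+(\d+x\d+)\s+(.*)", line)
    if m:
        for rate in m.group(2).split():
            if "*" in rate:
                print("current:", m.group(1), "@", rate.strip("*+"), "Hz")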
As a simple check, I tried the following two commands:
xrandr --q1
xrandr --q12
And indeed the first one gives me the same result I get programmatically.
Credits go to http://www.ogre3d.org/forums/viewtopic.php?f=4&t=65010&start=200