FontForge: How to scale a glyph - python-2.7

I have been able to batch import SVGs at their appropriate places through the Python scripts that come with FontForge. Now I want to scale up each glyph, say by 200%. I can do that in the FontForge GUI via
Element -> Transformations -> Transform, choosing "Scale Uniformly" from
the dropdown and entering 200%.
How can I do the same with the functions provided in the Python library? I couldn't find it.

Well, found it myself!
import fontforge
import psMat

SFD_FONT = fontforge.open("DejaVuMono.sfd")
INDEX = 105
# psMat.scale(0.50) scales to 50%; psMat.scale(2.0) would be the 200% case
SCALE_MATRIX = psMat.scale(0.50)
print SFD_FONT[INDEX].foreground[0].boundingBox()  # bounding box before
SFD_FONT[INDEX].foreground[0].transform(SCALE_MATRIX)
print SFD_FONT[INDEX].foreground[0].boundingBox()  # bounding box after
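To apply this to the whole font at the 200% from the question, a minimal sketch (assuming the same DejaVuMono.sfd file; glyphs() and glyph.transform() are part of FontForge's Python API):
import fontforge
import psMat

font = fontforge.open("DejaVuMono.sfd")
scale_200 = psMat.scale(2.0)  # 2.0 == 200%, matching the GUI dialog
for glyph in font.glyphs():
    glyph.transform(scale_200)
font.save("DejaVuMono-scaled.sfd")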

Related

Export Weka Fiji plugin segmentation as ROI

I am using the ImageJ/Fiji Weka plugin to classify different regions of an image, and I would like to obtain those regions for further processing (as ROIs? A table with the classes?).
In the end I want to get the area of each segmented class. I need to script this task, as it is part of a pipeline, but I'm not sure how to extract the results from the Weka ImageJ plugin.
The code example below is just to illustrate the idea. Does anyone have a basic example of applying a classifier and exporting the results?
Thank you in advance
import trainableSegmentation.*;
import ij.IJ;
import ij.ImagePlus;
// input train image
input = IJ.openImage( "someImage.jpg" );
// create Weka Segmentation object
segmentator = new WekaSegmentation( input );
// load classifier from file
segmentator.loadClassifier( "classifier.model" );
// apply the trained classifier to the input image
result = segmentator.applyClassifier( input );
...
You have a couple of answers about that in the image.sc forum: https://forum.image.sc/t/measuring-area-fraction-in-a-weka-classifed-image/7979
https://forum.image.sc/t/quantifing-weka-output/4660/6
That being said, you can also directly use the resulting label image for measurement within other plugins, such as the ones from MorphoLibJ (after installation, under Plugins > MorphoLibJ > Analyze).
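For the per-class area part, here is a rough Jython sketch for the ImageJ script editor. The WekaSegmentation calls mirror the snippet above; the histogram-based counting, and the assumption that the result is a label image with one small class index per pixel, are mine:
from trainableSegmentation import WekaSegmentation
from ij import IJ

input_image = IJ.openImage("someImage.jpg")
segmentator = WekaSegmentation(input_image)
segmentator.loadClassifier("classifier.model")
result = segmentator.applyClassifier(input_image)  # label image

# count pixels per class via the histogram of the label image
# (assumes class indices survive the conversion to 8-bit)
hist = result.getProcessor().convertToByte(False).getHistogram()
for class_index in range(len(hist)):
    if hist[class_index] > 0:
        print("class %d: %d pixels" % (class_index, hist[class_index]))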

Choropleth in Folium - having issues

I am quite new to coding and need to create some type of map to display my data.
I have data in an Excel sheet with a list of countries and, for each, a number giving the count of certain crimes. I want to show this in a choropleth map.
I have seen lots of different ways to code this but can't seem to get any of them to correctly read in the data. Will I need to import country codes into my df?
I have my world map from GitHub, downloaded in its raw format to my computer and into a Jupyter notebook.
My dataframe, read from the Excel sheet, is also loaded in the notebook.
What are the first steps I need to take to load this into a map?
This is the code I have had the most success with:
import pandas as pd
import folium
df = pd.read_excel('UK - Nationality and Type.xlsx')
state_geo = 'countries.json'
m1 = folium.Map(location=[55, 4], zoom_start=3)
m1.choropleth(
    geo_data=state_geo,
    data=df,
    columns=['Claimed Nationality', 'Labour Exploitation'],
    key_on='feature.id',
    fill_color='YlGn',
    fill_opacity=0.5,
    line_opacity=0.2,
    legend_name='h',
    highlight=True
)
m1.save("my_map.html")
But I just get a big world map, all in the same shade of grey.
(A screenshot of the countries.json structure followed here.)
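There is no definitive answer without seeing the data, but an all-grey choropleth usually means that the values in the key column don't match the GeoJSON's feature.id. Many world GeoJSON files use ISO-3 country codes as feature.id, so a sketch of the likely fix is to map country names to codes first (the column names and the tiny mapping here are assumptions):
import pandas as pd
import folium

df = pd.read_excel('UK - Nationality and Type.xlsx')

# hypothetical name-to-ISO-3 mapping; build a full one for real data
name_to_iso3 = {'France': 'FRA', 'Germany': 'DEU', 'Poland': 'POL'}
df['iso3'] = df['Claimed Nationality'].map(name_to_iso3)

m1 = folium.Map(location=[55, 4], zoom_start=3)
folium.Choropleth(
    geo_data='countries.json',
    data=df,
    columns=['iso3', 'Labour Exploitation'],  # key column must match feature.id
    key_on='feature.id',
    fill_color='YlGn',
    fill_opacity=0.5,
    line_opacity=0.2,
    legend_name='Labour Exploitation',
).add_to(m1)
m1.save("my_map.html")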

bokeh - plotting shapefile map using datashader

Initially, I created an interactive map of UK postcode areas where each area is color-coded based on its value (e.g. the population of that postcode area), as follows.
from bokeh.plotting import figure
from bokeh.palettes import Viridis256 as palette
from bokeh.models import LinearColorMapper
from bokeh.models import ColumnDataSource
import pandas as pd
import geopandas as gpd

shp = 'file_path_to_the_downloaded_shapefile'

# read shape file into dataframe using geopandas
df = gpd.read_file(shp)

def expandMultiPolygons(row, geometry):
    if row[geometry].type == 'MultiPolygon':
        row[geometry] = [p for p in row[geometry]]
    return row

# Some rows were MultiPolygons instead of Polygons.
# Expand each MultiPolygon into multiple rows of Polygons.
df = df.apply(expandMultiPolygons, geometry='geometry', axis=1)
df = df.set_index('Area')['geometry'].apply(pd.Series).stack().reset_index()

# Visualize the polygons. To give different post areas different colors,
# I added another column called 'value' holding a random integer.
p = figure()
color_mapper = LinearColorMapper(palette=palette)
source = ColumnDataSource(df)
p.patches('x', 'y', source=source,
          fill_color={'field': 'value', 'transform': color_mapper},
          fill_alpha=1.0, line_color="black", line_width=0.05)
where df is a dataframe of four columns : post code area, x-coordinate, y-coordinate, value (i.e. population).
The above code creates an interactive map in a web browser, which is great, but I noticed the interactivity is not very smooth: if I zoom in or pan, the map renders slowly. The dataframe has only 1106 rows, so I'm quite confused about why it is so slow.
As a possible solution, I came across datashader (https://datashader.readthedocs.io/en/latest/), but I find the example scripts quite complicated, and most of them use the HoloViews package in a Jupyter notebook, whereas I want to create a dashboard using Bokeh.
Can anyone advise me on incorporating datashader into the above Bokeh script? Do I need a different function within datashader to create the shape map instead of using Bokeh's patches function?
Any suggestion would be highly appreciated!
Without the data file involved, I can't answer your question directly, but can offer some observations:
Datashader is unlikely to be of value for this purpose, because Datashader does not currently have any support for rendering polygons. As a rule of thumb, Datashader is designed to aggregate your data; if the data is already aggregated, Datashader won't normally be of help. Here your data is aggregated by postcode, which Datashader can't process, but if you had the original data per person, it would happily render it.
If you prefer working with Bokeh directly rather than via the higher-level HoloViews/GeoViews interface, I'd recommend following Matt Rocklin's work on accelerating geopandas; his approach should be very fast for your purpose.
All that said, HoloViews and GeoViews should be a convenient way to work with Bokeh in general, whether or not you want to create a dashboard. E.g. the 2017 JupyterCon tutorial shows how to make a simple Bokeh dashboard using both libraries. It doesn't cover shapefiles, but those are covered in other GeoViews examples.
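As a rough illustration of that route, a sketch assuming GeoViews is installed and df is the geopandas frame from the question with its 'value' column:
import geoviews as gv
gv.extension('bokeh')

polys = gv.Polygons(df, vdims=['value']).opts(
    color='value', cmap='viridis', line_color='black', tools=['hover'])
# displaying `polys` renders it in a notebook; for a standalone page:
# from bokeh.io import show; show(gv.render(polys))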
As mentioned in my comment, I believe that the complexity of your polygons might cause your problem. The file you linked to contains several shapefiles of different sizes and complexities. You can simplify those, i.e. reduce the number of points in each polygon. This can change how they look: depending on the level of simplification you choose, the effect ranges from almost no visible difference, over a bit more "edginess", to an outright angular appearance. Depending on your needs, you can choose different levels of simplification.
I know of three easy options to get this done (a geopandas variant is sketched after the list):
GUI: Try QGIS. It is a great open-source tool for geospatial data processing. Load your shapefile as a new layer, then use the "Simplify Geometries" tool under the Vector menu.
Command line: GDAL is an open-source library that comes with a useful command-line tool. You can use it like this: ogr2ogr outfile.shp infile.shp -simplify 0.000001
Online: Visit mapshaper. Import your file, select simplify, and choose your level. Then export the result. What I really like here is that your file is rendered instantly, so you can immediately see the result of your simplification.
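Since the question already loads the shapefile with geopandas, the promised sketch of the same idea in Python uses GeoSeries.simplify (tolerance is in the units of the shapefile's CRS; the value below just mirrors the ogr2ogr example and is only a starting point):
import geopandas as gpd

df = gpd.read_file('file_path_to_the_downloaded_shapefile')
df['geometry'] = df['geometry'].simplify(tolerance=0.000001,
                                         preserve_topology=True)
df.to_file('simplified.shp')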
Other than that, you should also update your bokeh version. It gets updated regularly and there have been some performance improvements since.
Using HoloViews or GeoViews will not positively affect your performance, so it is not related to your issue. I guess @James A. Bednar was just giving some side advice there.
I found a way to speed up the interactive visualization of the UK map as I move the slider.
I first created an individual 2D image for each slider value, then updated the map using those images instead of Bokeh's patches function.
Since the images are in array format, updating the image as the slider value changes is much faster. One downside of this method is that I can no longer use the hover function on the UK map.
I referred to the following gist to convert polygon information into arrays: https://gist.github.com/brendancol/db030013e981c46acb2886060dde607e#file-rasterio_datashader_polygons-py-L35
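The gist is the reference for the exact code; as a rough outline of the idea (rasterio's rasterize plus Bokeh's image glyph; the 600x600 resolution and the column names are assumptions):
import numpy as np
from rasterio import features
from rasterio.transform import from_bounds
from bokeh.plotting import figure

# df is the geopandas frame; burn each polygon's value into a 2D array
west, south, east, north = df.total_bounds
transform = from_bounds(west, south, east, north, 600, 600)
img = features.rasterize(
    ((geom, int(v)) for geom, v in zip(df['geometry'], df['value'])),
    out_shape=(600, 600), transform=transform, fill=0)

p = figure()
# rasterio rows run top-down while Bokeh images run bottom-up, hence the flip
p.image(image=[np.flipud(img)], x=west, y=south,
        dw=east - west, dh=north - south, palette="Viridis256")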

Number of examples in each tfrecord

I am running the sample.sh script in Google Cloud Shell to call the preprocessing below on a set of images, following the steps of the flowers example.
https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/flowers/trainer/preprocess.py
Preprocessing succeeded on both the eval set and the train set, but the generated .tfrecord.gz files do not seem to match the number of images in eval/train_set.csv.
i.e. eval-00000-of-00157.tfrecord.gz suggests there are 157 files, while there are 35227 rows in eval_set.csv. Each row has a valid image_url (all images are uploaded to Storage) and a valid label.
I would like to know if there is a way to monitor and control the number of images per tfrecord file in the preprocess.py config.
Thanks
Update: got this to work out right:
import os
import tensorflow as tf
from tensorflow.python.lib.io import file_io

options = tf.python_io.TFRecordOptions(
    compression_type=tf.python_io.TFRecordCompressionType.GZIP)

# total number of records across all matching compressed shards
sum(1 for f in file_io.get_matching_files(os.path.join('url/path', '*.tfrecord.gz'))
      for example in tf.python_io.tf_record_iterator(f, options=options))
The filename eval-00000-of-00157.tfrecord.gz means that this is the first of 157 files; there should be 156 more similarly named files. Within each file, there can be any number of records.
If you want to manually count each record, try something like:
import os
import tensorflow as tf
from tensorflow.python.lib.io import file_io

# the .gz shards need the GZIP option, as in the update above
options = tf.python_io.TFRecordOptions(
    compression_type=tf.python_io.TFRecordCompressionType.GZIP)

files = os.path.join('gs://my_bucket/my_dir', 'eval-*.tfrecord.gz')
print(sum(1 for f in file_io.get_matching_files(files)
          for _ in tf.python_io.tf_record_iterator(f, options=options)))
Note that Dataflow makes no guarantees about the relationship between the number of files, or the ordering of records (inter- and intra-file), between input and output. However, the counts should be the same.
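On the 'control the number of images per tfrecord' part: the flowers preprocess.py writes its output with Apache Beam/Dataflow, and Beam's TFRecord sink accepts a num_shards argument. Forcing it disables Dataflow's automatic sharding, but it does pin the file count. A sketch with placeholder names, not the actual preprocess.py wiring:
import apache_beam as beam

# toy records standing in for the serialized tf.train.Examples
serialized_examples = [b'example-bytes-1', b'example-bytes-2']

with beam.Pipeline() as p:
    (p
     | beam.Create(serialized_examples)
     | beam.io.WriteToTFRecord(
         'gs://my_bucket/output/eval',   # hypothetical output prefix
         file_name_suffix='.tfrecord.gz',
         num_shards=10))                 # forces exactly 10 output files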

Monitor training/validation process in Caffe

I'm training the Caffe Reference Model for classifying images.
My work requires me to monitor the training process by plotting the model's accuracy every 1000 iterations on the entire training set and validation set, which have 100K and 50K images respectively.
Right now, I'm taking the naive approach: make snapshots every 1000 iterations and run the C++ classification code, which reads a raw JPEG image, forwards it through the net, and outputs the predicted labels. However, this takes too much time on my machine (with a GeForce GTX 560 Ti).
Is there a faster way to get the accuracy graph of the snapshot models on both the training and validation sets?
I was thinking about using the LMDB format instead of raw images. However, I cannot find documentation/code about doing classification in C++ using LMDB input.
1) You can use the NVIDIA DIGITS app to monitor your networks. It provides a GUI including dataset preparation, model selection, and learning-curve visualization. Moreover, it uses a Caffe distribution that allows multi-GPU training.
2) Or you can simply use the log parser inside Caffe:
/pathtocaffe/build/tools/caffe train --solver=solver.prototxt 2>&1 | tee lenet_train.log
This saves the training log to "lenet_train.log". Then, by using:
python /pathtocaffe/tools/extra/parse_log.py lenet_train.log .
you parse your training log into two CSV files, containing the train and test loss. You can then plot them using the following Python script:
import pandas as pd
import matplotlib.pyplot as plt

train_log = pd.read_csv("./lenet_train.log.train")
test_log = pd.read_csv("./lenet_train.log.test")

_, ax1 = plt.subplots(figsize=(15, 10))
ax2 = ax1.twinx()  # second y-axis for accuracy
ax1.plot(train_log["NumIters"], train_log["loss"], alpha=0.4)
ax1.plot(test_log["NumIters"], test_log["loss"], 'g')
ax2.plot(test_log["NumIters"], test_log["acc"], 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
plt.savefig("./train_test_image.png")  # save image as png
Caffe creates logs each time you try to train something; they are located in the tmp folder (on both Linux and Windows).
I also wrote a plotting script in Python which you can easily use to visualize your loss/accuracy.
Just place your training logs (with a .log extension) next to the script and double-click it.
You can use the command prompt as well, but for ease of use, when executed it loads all logs (*.log) it can find in the current directory.
It also shows the top 4 accuracies and the iterations at which they were achieved.
You can find it here: https://gist.github.com/Coderx7/03f46cb24dcf4127d6fa66d08126fa3b
The command
python /pathtocaffe/tools/extra/parse_log.py lenet_train.log
produces the following error:
usage: parse_log.py [-h] [--verbose] [--delimiter DELIMITER]
logfile_path output_dir
parse_log.py: error: too few arguments
Solution:
For successful execution of parse_log.py, we should pass two arguments:
log file
path of output directory
So the correct command is as follows:
python /pathtocaffe/tools/extra/parse_log.py lenet_train.log output_dir