Playing with Jupyter and the Darktable db… which focal length do I like most?

I am trying to embed a full Jupyter notebook here… it should be more or less self-explanatory. Please comment if you need help!

OK, so now: which nice new prime lens should I buy?

 


Exploring your DT database

Copy (for safety) your db:

    cp $HOME/.config/darktable/library.db dtlibrary.db

These are the available tables:

    sqlite> .tables
    color_labels     history          mask             tagged_images  
    db_info          images           meta_data        used_tags      
    film_rolls       legacy_presets   selected_images

First, open the database copy with sqlite3 and export the tables you need in CSV format (see, for example, http://www.sqlitetutorial.net/sqlite-tutorial/sqlite-export-csv/):

    sqlite3 dtlibrary.db
    SQLite version 3.22.0 2018-01-22 18:45:57
    Enter ".help" for usage hints.
    sqlite> .headers on
    sqlite> .mode csv
    sqlite> .output images.csv
    sqlite> select * from images;
    sqlite> .quit
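
As an alternative to the CSV export, the table can probably be read straight from the database copy with pandas. This is only a minimal sketch: `load_images` is a helper name made up for this example, and it assumes the copy contains an `images` table as listed by `.tables` above.

```python
import sqlite3

import pandas as pd


def load_images(db_path):
    """Read the whole `images` table of a darktable library copy into a DataFrame.

    `load_images` is a hypothetical helper, not part of darktable or pandas.
    """
    # Work on the copied file, not on the live library in ~/.config/darktable
    conn = sqlite3.connect(db_path)
    try:
        return pd.read_sql_query("SELECT * FROM images", conn)
    finally:
        conn.close()

# Usage (assuming the copy made above): allphotos = load_images("dtlibrary.db")
```

This skips the intermediate `images.csv` file; the resulting DataFrame should have the same columns as the one loaded with `pd.read_csv`.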
In [1]:
# sources
# https://songhuiming.github.io/pages/2017/04/02/jupyter-and-pandas-display/
#
import sys
import os
# BEWARE only for command line interactive usage
from math import *
import scipy as sp
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# 'display.height' is no longer a valid option in recent pandas; 'display.max_rows' below limits the output
pd.set_option('display.max_rows', 8)
pd.set_option('display.max_columns', 10)
pd.set_option('display.width', 80)
In [2]:
allphotos = pd.read_csv("images.csv")
In [3]:
allphotos
Out[3]:
         id  group_id  film_id  width  height  ...  write_timestamp  history_end  altitude        position  aspect_ratio
0      8244      8244      295   5344    3516  ...       1440610685           13       NaN  35407710388224           0.0
1      8245      8244      295   5344    3516  ...       1440610554           15       NaN  35412005355520           0.0
2      8246      8244      295   5344    3516  ...       1440610590           15       NaN  35416300322816           0.0
3      8513      8513      304   4928    3280  ...       1451331582           10       NaN  36563056590848           0.0
...     ...       ...      ...    ...     ...  ...              ...          ...       ...             ...           ...
5653  17653     17654      564   5184    3888  ...       1565097337            0       NaN  75797582839808           NaN
5654  17654     17654      564   5240    3912  ...       1565097337            0       NaN  75801877807104           NaN
5655  17655     17656      564   5184    3888  ...       1565097337            0       NaN  75806172774400           NaN
5656  17656     17656      564   5240    3912  ...       1565097337            0       NaN  75810467741696           NaN

5657 rows × 42 columns

In [4]:
allphotos.columns
Out[4]:
Index(['id', 'group_id', 'film_id', 'width', 'height', 'filename', 'maker',
       'model', 'lens', 'exposure', 'aperture', 'iso', 'focal_length',
       'focus_distance', 'datetime_taken', 'flags', 'output_width',
       'output_height', 'crop', 'raw_parameters', 'raw_denoise_threshold',
       'raw_auto_bright_threshold', 'raw_black', 'raw_maximum', 'caption',
       'description', 'license', 'sha1sum', 'orientation', 'histogram',
       'lightmap', 'longitude', 'latitude', 'color_matrix', 'colorspace',
       'version', 'max_version', 'write_timestamp', 'history_end', 'altitude',
       'position', 'aspect_ratio'],
      dtype='object')

Fix the database

Many OLYMPUS entries have a crop factor of 0 stored in the database. I fix this here in pandas by setting it to 2, but you can also fix the database itself with SQL:

    -- check
    select * from images where maker like '%OLYMPUS%' and crop=0;
    -- change
    update images set crop=2 where maker like '%OLYMPUS%' and crop=0;
In [5]:
allphotos.loc[(allphotos.maker=="OLYMPUS CORPORATION")&(allphotos.crop==0), 'crop']=2
In [6]:
allphotos['FLeq35'] = allphotos.crop * allphotos.focal_length
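
To make the arithmetic of the last two cells concrete, here is a small sketch on a toy table. The three rows are invented for illustration and do not come from the real database; only the column names match.

```python
import pandas as pd

# Toy rows standing in for the `images` table (values are made up)
toy = pd.DataFrame({
    "maker": ["OLYMPUS CORPORATION", "OLYMPUS CORPORATION", "SONY"],
    "crop": [0.0, 2.0, 1.5],
    "focal_length": [25.0, 70.0, 22.0],
})

# Same fix as in the notebook: Olympus rows with a zero crop factor get 2
bad = (toy.maker == "OLYMPUS CORPORATION") & (toy.crop == 0)
toy.loc[bad, "crop"] = 2

# 35 mm equivalent focal length = crop factor * real focal length
toy["FLeq35"] = toy.crop * toy.focal_length
print(toy.FLeq35.tolist())  # [50.0, 140.0, 33.0]
```

Without the fix, the first row would get an equivalent focal length of 0 and be dropped by the `FLeq35>0` filter used in the next cell.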
In [7]:
FLseries = allphotos[allphotos.FLeq35>0].FLeq35
In [8]:
FLs=np.array(FLseries)
In [9]:
FLs
Out[9]:
array([  33.        ,   33.        ,   31.99999899, ...,  250.        ,
        140.        ,  140.        ])
In [10]:
plt.hist(FLs, bins =[0,20,40,60,80,120,200,500])
Out[10]:
(array([   21.,  3007.,   971.,   504.,   603.,   223.,   316.]),
 array([  0,  20,  40,  60,  80, 120, 200, 500]),
 <a list of 7 Patch objects>)
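
The raw counts are easier to compare as percentages. A minimal sketch, using the counts copied from Out[10] above (calling `np.histogram(FLs, bins=edges)` on the real data should give the same numbers without drawing a plot):

```python
import numpy as np

# Bin edges in mm (35 mm equivalent), as used in the histogram above
edges = [0, 20, 40, 60, 80, 120, 200, 500]
# Counts copied from Out[10]
counts = np.array([21, 3007, 971, 504, 603, 223, 316])

# Share of all shots that falls in each focal-length range
shares = 100.0 * counts / counts.sum()
for lo, hi, share in zip(edges[:-1], edges[1:], shares):
    print(f"{lo:>3}-{hi:<3} mm: {share:5.1f} %")
```

More than half of the shots land in the 20-40 mm range, which is a first hint for the prime lens question.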

Let’s try to separate them by camera.

In [11]:
cam_and_fl = allphotos[allphotos.FLeq35>0].loc[:, ['model', 'FLeq35']]
In [12]:
cam_and_fl
Out[12]:
           model      FLeq35
6       SLT-A55V   33.000000
7       SLT-A55V   33.000000
9        DMC-LX5   31.999999
10      SLT-A55V   45.000000
...          ...         ...
5653  E-M1MarkII  250.000000
5654  E-M1MarkII  250.000000
5655  E-M1MarkII  140.000000
5656  E-M1MarkII  140.000000

5645 rows × 2 columns

In [13]:
models = cam_and_fl.model.unique(); models
Out[13]:
array(['SLT-A55V', 'DMC-LX5', 'DMC-TZ100', 'TG-870', 'E-M5MarkII',
       'E-M1MarkII', 'E-M10 Mark III'], dtype=object)
In [14]:
from matplotlib.ticker import ScalarFormatter
# One log-scale histogram per camera (a hand-picked subset of `models`)
for mymod in ['SLT-A55V', 'DMC-LX5', 'DMC-TZ100', 'E-M1MarkII']:
    fig, axs = plt.subplots(nrows=1, figsize=(5,3.5))
    thisfls=cam_and_fl[cam_and_fl.model==mymod].FLeq35
    L=np.array(thisfls)
    axs.set_xscale("log")
    axs.xaxis.set_major_formatter(ScalarFormatter())
    axs.set_title(mymod)
    axs.set_xticks([15, 24, 35, 50, 70, 100, 140, 200, 300])
    axs.hist(L, bins =np.logspace(np.log10(10), np.log10(400), 20 ))
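
Besides the plots, a per-camera summary table may help. The sketch below uses a toy stand-in for `cam_and_fl` with invented rows; the column names `shots`, `median` and `most_used` are chosen for this example. Focal lengths are rounded before taking the mode because values such as 31.999999 show up in the data.

```python
import pandas as pd

# Toy stand-in for `cam_and_fl` (rows are made up for illustration)
toy = pd.DataFrame({
    "model": ["SLT-A55V", "SLT-A55V", "SLT-A55V", "E-M1MarkII", "E-M1MarkII"],
    "FLeq35": [33.0, 33.0, 45.0, 250.0, 140.0],
})

# One row per camera: number of shots, median and most frequent focal length
summary = toy.groupby("model").FLeq35.agg(
    shots="count",
    median="median",
    # On ties, mode() returns the values sorted, so the shortest one is kept
    most_used=lambda s: s.round().mode().iloc[0],
)
print(summary)
```

On the real data, the same call on `cam_and_fl` should give one row for each of the seven models listed in Out[13].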
