Using Django as a Data Tool in the Enterprise
Trent Oliphant
Continuum Analytics
PyData NYC – November 10, 2015
© 2015 Continuum Analytics- Confidential & Proprietary
NOT ALL DATA IS BIG
Enterprise Reporting
[Diagram: Central Data Store, Simple Process, Clear Results]
Enterprise Reporting
[Diagram: five BC (business center) boxes, Complex Processes, Results, Extra]
Enterprise Reporting
• Aggregated
• Multiple Business Centers
• Various Size Centers
• Different Data
Business Center Data
[Diagram: Multiple Data Sources, Multiple Processes, Results, Corporate]
Business Center Reporting
• Needs to feed upstream
• Have their own needs
• Smaller Teams
• Smaller Budgets
• Smaller Data
OPEN SOURCE AS AN OPTION
Advantages
• Cost
• Ease of use
• Community Resources
– GitHub
– Stack Overflow
– Anaconda.org
Disadvantages
• Distribution and Installation
• Support
• Knowledge
– Lack of internal sharing
– No external sharing
Anaconda Enterprise
• Package Deployment
• Collaboration
• Support
• Indemnification
WHY DJANGO
What is Django?
• http://djangoproject.com
• Web framework
• Written in Python
• Model-View-Template model
• v. 1.8 or higher
Why Django?
• Easy Install and Setup
• Django ORM
• Built in Authentication
• Built in Admin Interface
• Talent Pool
Easy Install and Setup
• Using Anaconda
– conda install django
– django-admin startproject myproj
• Built in Development web server
– python manage.py runserver
Django ORM
• Create Models with fields
• DB Management Handled
• Work with Objects/Properties not SQL
• Can work with SQL directly
Built in Authentication
• django.contrib.auth
• Basic Permissions
• Groups
• Sessions
Built in Admin Interface
• django.contrib.admin
• Register model
• Basic data entry and editing
Talent Pool
• Large Community
• Active Community
• Available Developers
What about ______?
• SQLAlchemy
• Flask
• TurboGears
PROJECT SETUP
Requirements
• Automate forecasting
• Simple User Interface
• Regular Data Update
• Excel “integration”
Team Structure
• Four Groups
– Modeling
– Finance
– Data
– Development
Other influences
• Corporate Finance
• Corporate IT
• Internal Corporate Audit
• Regulations
Tools used
• SAS
• Oracle
• Teradata
• Excel
• Python
Environments
• Servers
– Production
– UAT (User Acceptance Testing)
– Development
• Workstations
Workstations
• Desktop/Laptops
• Windows 7 Enterprise
• Locked down
Servers
• Linux
• Apache
• Oracle
Data
• Aggregated from Teradata
• 115 Tables (including output tables)
• Each run generates ~30 MB of data
• “Future” data becomes real each month
• New future data sets created
SPECIFIC ISSUES
Data Governance and Controls
• Authentication (Single Sign On)
• Access Control
• Data Validation
Data Sharing
• Excel Files
– Multiple Copies
– Modifications
• Database
– Access Concerns
Data Sharing
• Specialization
Limited Machine Access
• No shell access
SPECIFIC SOLUTIONS
Request Flow
• Apache > SSO Agent > Django
• Request > Middleware > URL resolution >
View resolution > Template > Response
• Models can be used anywhere in the chain
Integrating with Authentication
• Create Custom Authentication
• Create Middleware Class
• Update settings.py file to recognize
– AUTHENTICATION_BACKENDS
– MIDDLEWARE_CLASSES
Create Custom Authentication
from django.contrib.auth.models import User

class MyBackend(object):
    def authenticate(self, username=None, password=None):
        # Check the username/password and return a User.
        return User.objects.get(username=username)

    def get_user(self, user_id):
        # Called on later requests to reload the user stored in the session.
        try:
            return User.objects.get(pk=user_id)
        except User.DoesNotExist:
            return None
class IntegratedBackend(object):
    def authenticate(self, **credentials):
        # The credentials are the header values supplied by the SSO agent.
        username = credentials.get('STANDARDID')
        first_name = credentials.get('FIRSTNAME') or ''
        last_name = credentials.get('LASTNAME') or ''
        email = credentials.get('EMAIL') or ''
        if not username:
            return None
        try:
            user = User.objects.get(username=username)
        except User.DoesNotExist:
            # First visit: create the account as inactive.
            user = User(username=username,
                        first_name=first_name,
                        last_name=last_name,
                        email=email,
                        is_active=False)
            # The password lives in the external login system.
            user.set_unusable_password()
            user.save()
        if not user.is_active:
            user = None
        return user
Create Middleware Class
from django.contrib.auth import authenticate, login, logout

class SSOIntegrationMiddleware(object):
    header_fields = ['STANDARDID', 'FIRSTNAME', 'LASTNAME', 'EMAIL']

    def process_request(self, request):
        headers = {x: request.META.get(x) for x in self.header_fields}
        # Log out when the session user differs from the SSO identity.
        if not (request.user.username == request.META.get('STANDARDID')):
            logout(request)
        # is_authenticated is a method in Django 1.8 (a property from 1.10).
        if not request.user.is_authenticated():
            user = authenticate(**headers)
            if user is not None:
                login(request, user)
        return None
Update settings.py file
AUTHENTICATION_BACKENDS = (
    'auth.IntegratedBackend',
    'django.contrib.auth.backends.ModelBackend'
)
MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'middleware.SSOIntegrationMiddleware',
)
Mocking Integration
• Create yaml file
• Create mock function
• Update Middleware Class
Basic ssomock.yaml
active : trent
trent :
    # IDs are quoted so they load as strings, like real header values
    STANDARDID : '123456'
    FIRSTNAME : Trent
    LASTNAME : Oliphant
    EMAIL : trent.oliphant@continuum.io
bob :
    STANDARDID : '987654'
    FIRSTNAME : Bob
    LASTNAME : Rumsfield
    EMAIL : bob@whitehouse.gov
Create mock function
import yaml

def _get_mocked_headers(self):
    # Return the header dict of the user named by 'active', or None.
    headers = None
    with open('ssomock.yaml', 'r') as f:
        raw = yaml.safe_load(f)
    active = raw.get('active')
    if active:
        headers = raw.get(active)
    return headers
Update Middleware class
if not request.META.get('STANDARDID'):
    # No SSO headers on the request: fall back to the mock file.
    headers = self._get_mocked_headers() or {}
    request.META.update(headers)
else:
    headers = {x: request.META.get(x) for x in self.header_fields}
Access Control
• Normally at the model level
– delete, change, add
• Uses django_content_type table
• Needed it to be at a view (page) level
Access Control
• Create Custom Content Type
• Custom model manager
• Create Custom Permission model
• Register admin interface
• Add decorator to views
Create Content Type
• Insert into django_content_type table
– app_label = ‘ui’
– model = ‘uipermission’
• Through admin interface or direct to DB
Custom Permission Manager
from django.db import models

class UIPermissionManager(models.Manager):
    def get_queryset(self):
        # Only return permissions attached to the custom content type.
        return super(UIPermissionManager, self).get_queryset().filter(
            content_type__model='uipermission'
        )
Custom Permission Model
from django.contrib.auth.models import Permission
from django.contrib.contenttypes.models import ContentType

class UIPermission(Permission):
    objects = UIPermissionManager()

    class Meta:
        proxy = True
        verbose_name = 'ui_permission'

    def save(self, *args, **kwargs):
        # Always attach the permission to the 'uipermission' content type.
        ct, created = ContentType.objects.get_or_create(
            model=self._meta.model_name,
            app_label=self._meta.app_label,
        )
        self.content_type = ct
        super(UIPermission, self).save(*args, **kwargs)
Add permission to view
from django.contrib.auth.decorators import permission_required

# Permissions are named '<app_label>.<codename>'.
@permission_required('permission_name', login_url='/denied_page')
def my_view(request):
    ...
Accessing Output
• Output written to database
• Create excel files
– email
– Download
• Download CSV and log files
Create Excel file
• Uses XlsxWriter
• Gets pandas dataframe from SQL query
• Each query written to own tab
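One possible shape for this step is sketched below. It assumes pandas with the XlsxWriter engine installed; the queries_to_workbook helper, the query names and the in-memory SQLite database (standing in for Oracle) are all hypothetical and only there to make the example runnable.

```python
import io
import sqlite3

import pandas as pd


def queries_to_workbook(conn, queries, target):
    """Write the result of each SQL query to its own worksheet (tab)."""
    with pd.ExcelWriter(target, engine="xlsxwriter") as writer:
        for sheet_name, sql in queries.items():
            frame = pd.read_sql_query(sql, conn)
            # Excel limits worksheet names to 31 characters.
            frame.to_excel(writer, sheet_name=sheet_name[:31], index=False)


# Hypothetical data standing in for the forecasting output tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forecast (center TEXT, amount REAL)")
conn.executemany("INSERT INTO forecast VALUES (?, ?)",
                 [("East", 100.0), ("East", 250.0), ("West", 75.0)])

buffer = io.BytesIO()
queries_to_workbook(conn, {
    "detail": "SELECT center, amount FROM forecast",
    "totals": "SELECT center, SUM(amount) AS total FROM forecast GROUP BY center",
}, buffer)
workbook_bytes = buffer.getvalue()
```

Writing to an in-memory buffer keeps the same helper usable for both delivery routes: the bytes can be attached to an email or returned as a download.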
Download File
import os
# django.core.servers.basehttp.FileWrapper was removed in Django 1.9.
from wsgiref.util import FileWrapper

from django.http import HttpResponse

def download_file(request):
    filepath = 'Newly created file'
    wrapper = FileWrapper(open(filepath, 'rb'))
    response = HttpResponse(wrapper, content_type='application/force-download')
    response['Content-Length'] = os.path.getsize(filepath)
    filename = os.path.basename(filepath)
    response['Content-Disposition'] = 'attachment; filename={}'.format(filename)
    return response
Uploading Data
• Simple form
• Tab names must match table/model names
• Column names must match
• Uses xlrd, pandas and cursor (not ORM)
Uploading Data
import xlrd
import pandas as pd
from django.shortcuts import render
from django.db import connection, IntegrityError, DatabaseError

def upload_data(request):
    if request.method == 'POST':
        uploaded = request.FILES['uploaded_file']
        workbook = xlrd.open_workbook(file_contents=uploaded.read())
        for sheetname in workbook.sheet_names():
            # Do some error checking
            df = pd.read_excel(workbook, sheetname, engine='xlrd')
            cols = ', '.join(df.columns)
            # Django wrapper of the cx_Oracle connector expects %s format
            val_holder = ', '.join(['%s'] * len(df.columns))
            stmt_text = "INSERT INTO {} ({}) VALUES ({})"
            stmt = stmt_text.format(sheetname, cols, val_holder)
            cursor = connection.cursor()
            cursor.executemany(stmt, df.values.tolist())
    return render(request, 'upload.html')
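Because the sheet and column names are formatted straight into the SQL text, the "error checking" step matters. A minimal sketch of one way to do it follows, assuming an allow-list of known tables and their columns; the build_insert helper and the forecast_input schema are hypothetical and not from the talk.

```python
def build_insert(table, columns, known_tables):
    """Return an INSERT statement, refusing unknown tables or columns."""
    expected = known_tables.get(table)
    if expected is None:
        raise ValueError("unknown table: {!r}".format(table))
    if list(columns) != list(expected):
        raise ValueError("columns do not match table {!r}".format(table))
    # Only the values are parameterised; names come from the allow-list.
    placeholders = ", ".join(["%s"] * len(expected))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(expected), placeholders)


# Hypothetical schema: tab name -> column names in the expected order.
known = {"forecast_input": ["center", "month", "amount"]}
stmt = build_insert("forecast_input", ["center", "month", "amount"], known)
```

This sketch requires the columns in exactly the expected order; comparing sets instead would accept reordered columns, at the cost of having to reorder the data before the insert.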
Basic Admin Access
• __str__ representation of the object
• No data
from django.contrib import admin
from django.apps import apps

for model in apps.get_app_config('data').get_models():
    admin.site.register(model)
Tabular view
• Use list_display as property of class
• Needs a ModelAdmin class
class ExampleModelAdmin(admin.ModelAdmin):
    list_display = ('field1', 'field2', 'field3')

admin.site.register(ExampleModel, ExampleModelAdmin)
Tabular Admin View
for model in apps.get_app_config('data').get_models():
    # Show every concrete (column-backed) field as a column.
    field_names = [f.name for f in model._meta.get_fields()
                   if f.concrete]
    cls_nm = "{}_admin".format(model._meta.model_name)
    options = {'list_display': field_names}
    # Build a ModelAdmin subclass at runtime for this model.
    cls = type(cls_nm, (admin.ModelAdmin,), options)
    admin.site.register(model, cls)
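The three-argument form of type() used above may be unfamiliar. The sketch below shows that it builds the same kind of class as a class statement; ModelAdminStandIn is a hypothetical stand-in for admin.ModelAdmin so the example runs without Django, and the field names are made up.

```python
class ModelAdminStandIn(object):
    """Stand-in for admin.ModelAdmin so the sketch runs without Django."""
    list_display = ("__str__",)


# Written out by hand...
class ForecastAdmin(ModelAdminStandIn):
    list_display = ["id", "center", "amount"]


# ...and built at runtime: type(name, bases, namespace).
field_names = ["id", "center", "amount"]
GeneratedAdmin = type("forecast_admin", (ModelAdminStandIn,),
                      {"list_display": field_names})
```

Generating the classes in a loop avoids writing one ModelAdmin per table, which matters with more than a hundred tables.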
Using a Different Oracle Schema
• Runs check_migrate
– Reads USER_TABLES
Intercepting Django Logging
• Turn off default logging
– LOGGING_CONFIG = None
• Use ‘django’ as the name of logger
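A minimal sketch of the two bullets above, using only the standard logging module. The in-memory stream and the format string are assumptions; a real project would point the handler at its own log destination.

```python
import io
import logging

# In settings.py: skip Django's own logging configuration.
LOGGING_CONFIG = None

stream = io.StringIO()  # stands in for the project's real log destination
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

# Attach the project's handler to the logger named 'django'.
django_logger = logging.getLogger("django")
django_logger.setLevel(logging.INFO)
django_logger.addHandler(handler)

# Django logs under child names such as 'django.request'; those records
# propagate up to the 'django' logger and reach the handler above.
logging.getLogger("django.request").warning("Not Found: %s", "/missing")
captured = stream.getvalue()
```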
Overriding SETTINGS
• settings.py is just a python file
• Read yaml file
• Update globals() with those from file
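The override step might look like the sketch below. The helper names, the file name overrides.yaml and the rule of copying only upper-case keys are assumptions for illustration; read_yaml needs PyYAML and is shown but not called here.

```python
def apply_overrides(namespace, overrides):
    """Copy upper-case keys from overrides into a settings namespace."""
    applied = {key: value for key, value in (overrides or {}).items()
               if key.isupper()}
    namespace.update(applied)
    return applied


def read_yaml(path):
    """Parse the override file; requires PyYAML."""
    import yaml  # imported lazily so apply_overrides has no dependency
    with open(path) as handle:
        return yaml.safe_load(handle) or {}


# Defaults as they might appear earlier in settings.py.
DEBUG = True
ALLOWED_HOSTS = []

# In a real settings.py this would be:
#     apply_overrides(globals(), read_yaml("overrides.yaml"))
applied = apply_overrides(globals(), {
    "DEBUG": False,
    "ALLOWED_HOSTS": ["reports.example.com"],
    "not_a_setting": 1,
})
```

Filtering on upper-case keys keeps stray entries in the file from shadowing helper names inside settings.py.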
Managed = False
• Different team deployed database schema
• No rights for Django to create schema
• python manage.py sqlmigrate <app_label> <migration_name> > output.sql
Things to watch out for
• Meta options
– db_table
– managed
• DatabaseError, IntegrityError
– Django wraps the underlying cx_Oracle


Editor's Notes

  • #3: How many work in an t