Vlad Seryakov
vlad@crystalballinc.com
April, 2001

Security design for Web applications

Preface

When an HTTP request arrives, the only information we have is the request line with its query parameters, plus any cookies. These are examples of HTTP requests:

    GET /index.htm HTTP/1.0
    GET /cgi-bin/MyObjects/MyPortal.asp?account_id=123

There may be additional headers that tell the Web server which languages and encodings the browser supports, carry cookies, and so on. Apart from cookies, all these headers are supplementary information; the resource is identified by the first request line. For a general-purpose Web server, which serves all kinds of requests including HTML files and CGI scripts, the request line basically contains a file path and name within the Web server's filesystem. To prevent unauthorized access to some files or scripts, a Web server has various security mechanisms. They restrict access to a whole resource by name or directory, with no knowledge of the resource itself or its properties.

When building a Web application with access restrictions, or even a simple application that requires authorized access, I had to build this security layer almost from scratch for every application. This is mostly because different Web servers have different capabilities and APIs, and even different Apache modules have different APIs. So when I programmed in Python I implemented it in Python, and when I switched to PHP I had to use PHP functions and rewrite authorization again. Every time I write this, I do the same job.

Is it possible to create a security layer that handles the security part and is almost independent of the application itself? I think it is. Many applications have their own security implementation because that implementation is too tightly bound to the application logic. Why not define the security functions, put them into a separate API, and implement it as independently of the language and Web server as possible?
In this case the application is a client of the security subsystem and should accept the rules and restrictions that this subsystem introduces. The idea is not new; it is just a slightly different approach to building a Web application. Instead of creating the application logic and then adding access restrictions to it, we take an existing security implementation and use its API.

Our security model is based on a naming convention for application requests. When you write your code, you should use this convention, or the API, to build links between the various parts of the application. The security layer sits between the Web server and your application; it takes these requests and checks them against its access database. We also use an SQL database to store session and user information. Together this gives us a robust security system that can be used for many Web applications.

Implementation

1. Mechanism

The idea is that every request to our secure Web site is processed by a security handler. This handler verifies the user's credentials and either lets the request pass or rejects it, according to the access privileges this user has in our security database.

We actually have two tasks, which are connected but at the same time separate from each other. First, we need to authenticate the user; in other words, we have to know who is trying to access our restricted Web site. Second, we need to determine the user's access rights to the requested resource. Both tasks should use the same database, or at least have access to the same user information. The first task can be accomplished with cookies or with the native HTTP authentication method. The second task is our security model implementation itself.

We use AOLserver, the Web server that America Online uses for its Web sites. It is not just a Web server; it is an application server with an embedded Tcl interpreter and a very powerful API.
In the AOLserver environment we use the filter feature, whereby we install a global handler and specify a URL pattern:

    ns_register_filter preauth GET /* security_filter
    ns_register_filter preauth POST /* security_filter

For every request that matches this pattern the handler is called and verifies the request path, parameters and/or cookies. It checks the user against the security permissions stored in the database for each particular user or group of users. Such a mechanism is not limited to AOLserver. In the Apache Web server it is possible to configure a virtual URL using a directive, so that all requests beginning with this URL are intercepted by the security handler, which can be implemented in any available language such as Perl, Python, Tcl or PHP.

2. Data model

Our security model is backed by an SQL database. A database is certainly required; the vendor does not matter, as long as it is a true ACID database. ACID stands for Atomicity, Consistency, Isolation and Durability. This means that the operations within a transaction are either all completed or all rolled back; that every transaction leaves the data in a valid state; that the database gives concurrent access to many users while each user's session is separated from the others and does not interfere with them directly, so each user feels like he is working with the database alone; and that all your committed transactions stay in the database until you delete them, even after a crash of the database server or the computer. All commercial databases conform to these principles, so we will not discuss which SQL server to use.

Of course, with some effort it is possible to implement everything we discuss here in flat files instead of a database. But that will not be as reliable, portable and scalable when there are many users. We should also think about manageability and support, including extending and improving the software.
The SQL interface provides a very good and well-known level of abstraction, and vendors have already spent many years building reliable database systems. Our goal is our application, not a storage system.

Our users table looks like this:

    CREATE TABLE users (
      user_id INTEGER NOT NULL,
      user_name VARCHAR(32) NOT NULL,
      password VARCHAR(128) NOT NULL,
      salt VARCHAR(128) NOT NULL,
      session_id VARCHAR(128) NULL,
      login_time INTEGER DEFAULT 0 NOT NULL,
      access_time INTEGER DEFAULT 0 NOT NULL,
      ipaddr VARCHAR(16) NULL,
      PRIMARY KEY(user_id),
      UNIQUE(user_name)
    );

We show only the required columns. The table may contain other useful information about the user, such as first and last name, email address, postal address and so on.

user_id contains the unique identifier of each user.

user_name contains the name the user enters to log into the system.

password contains the user's password, stored as a digest rather than in clear text.

salt is a special column used for password generation and verification.

session_id contains a unique ID that identifies each user session. It is different from user_id, because the session identifier is different for every session, even for the same user. It is used for tracking and authorizing the user's interaction with the secure application.

login_time contains the date and time of the user's last successful login.

access_time contains the date and time of the last request in the current session. This information is used for tracking user sessions and expiring them after some time of inactivity.

ipaddr contains the IP address of the current session. It is set at login time, so even if you move the browser's cookie file from one computer to another, you cannot continue the same session. The server compares the IP address stored in the users table for this session with the IP address of the current HTTP request. Because they are different, you will be prompted to log in.

Users may be combined into groups that share the same set of access permissions.
This is very convenient: it is possible to define a set of permissions for each type of user and put these access rights into a separate group. When a user is created, he may be linked to the corresponding group, so there is no need to define the same permissions for each particular user. We have a kind of inheritance here: if there is no specific access permission for the user, we look into the user's group(s). Our groups table contains just the required columns and can be extended to meet any specific needs.

    CREATE TABLE groups (
      group_id INTEGER NOT NULL,
      group_name VARCHAR(64) NOT NULL,
      description VARCHAR(255) NULL,
      PRIMARY KEY(group_id),
      UNIQUE(group_name)
    );

group_id is the unique group identifier.

group_name is the unique group name; this name is shown in the application for convenience.

description is just text with information about the group.

To include a particular user in a group we need another table, which contains the links between the users and groups tables. This table represents a many-to-many relation between users and groups, where any number of users can belong to any number of groups.

    CREATE TABLE user_groups (
      user_id INTEGER NOT NULL REFERENCES users(user_id),
      group_id INTEGER NOT NULL REFERENCES groups(group_id),
      PRIMARY KEY(user_id,group_id)
    );

And finally we create the access permission table, which contains the access rights for each user or group. Any user can have many records in this table; we can define as many access records as needed. There are no technical limits, only the security logic of each particular application.
    CREATE TABLE acls (
      obj_id INTEGER NOT NULL,
      obj_type CHAR(1) NOT NULL CHECK(obj_type IN ('U','G')),
      project_name VARCHAR(64) NOT NULL,
      app_name VARCHAR(32) NOT NULL,
      app_context VARCHAR(64) NOT NULL,
      cmd_name VARCHAR(32) NOT NULL,
      cmd_context VARCHAR(32) NOT NULL,
      value CHAR(1) NOT NULL CHECK(value IN ('Y','N')),
      PRIMARY KEY(obj_id,project_name,app_name,app_context,cmd_name,cmd_context)
    );

obj_id contains an ID from the users or groups table only. To enforce this, it is possible to write a trigger that verifies that the inserted ID exists in either the users or the groups table.

obj_type determines the type of object stored in the obj_id column: U for a user, G for a group.

value contains Y if access to the application part specified in this record is granted, or N if it is refused. Using N allows us to open access to a whole context and then restrict access to specific parts within this context.

The other columns reflect our security model, which we discuss later in this document.

3. Cookies

Each user has two cookies:

 - user_id, a unique permanent ID that identifies the user
 - session_id, a temporary cookie that identifies a particular session for each user_id

All security cookies are digitally signed, so a cookie that has been tampered with is detected and rejected. The session_id of each user is also stored in the 'users' table and periodically checked for expiration.

The user_id cookie is assigned after a user has successfully logged into the system. It is the unique identifier that is the primary key of the 'users' table. We can keep this cookie in the browser for a long time, because the same user always has the same user_id. This is useful, for example, for public sites or for the public part of secured sites: you can show customized information for this particular user, as is done on many Web portals.

When a user provides a user name and password, we look in the table for a record with the provided user name.
If the record is found, we verify the provided password against the one in the database. We do not keep passwords in clear text; the database holds a digest computed with the MD5 or SHA1 algorithm. So for verification we have to calculate the digest of the provided password together with the salt that is kept in the database. The salt is just a randomly generated string; the less predictable the salt is, the more secure the system and the more difficult it is to break. The resulting digest should be the same as the stored password.

To store a password we do the following:

 - set the salt to a unique generated string
 - create the password digest as the result of MD5(user_password,salt)
 - store the password digest and the salt in the table

To verify a user during login:

 - find the user in the database
 - create a password digest by calling MD5(login_password,salt_from_database)
 - compare this digest with the value of the password column

If the password is correct, we generate a unique session ID for this user, store it in the table and in a cookie in the user's browser, and let the user into the system.

The session_id cookie is assigned at every login and lives for a configured period of time. Every time a user logs into the system, a new session ID is generated and assigned to the user through the cookie mechanism. Supporting sessions and identifying user actions within a session is essential for an interactive Web application. It is used, for example, in shopping carts on e-commerce sites.

We store the session ID of each user in the same 'users' table. For each request that contains a session_id cookie, the cookie is compared with the ID of the user's current session. If the values are different, the request is simply refused. The session_id cookie is also set to expire after some period of time. The server renews the session cookie shortly before it expires if the user is still working with the system. If the user has been away from his computer and the cookie has already expired when he returns, the system prompts for the password in order to verify who is working.

4. HTTP authentication

Cookies are not the only way a user can be authenticated on the Web. All browsers support the built-in HTTP authentication scheme, whereby user credentials are sent in HTTP headers with every request. The user is asked for a user name and password only once, the first time he or she accesses the restricted Web site, and after that the browser sends them automatically. Using this authentication method over plain HTTP is dangerous, because the password is sent in clear text (actually it is base64 encoded, but that should not be counted as encryption). Combining this method with SSL gives a fairly secure and simple way to authenticate users. And because every request comes with the user name, it is very easy to locate the user in the database.

5. Access permissions

The security filter checks access rights at the application, context and command levels. Additional access restrictions may be applied inside the application. The filter checks the request line for valid "application", "application context", "command" and "command context" tokens. To accomplish this we define that each request's path name consists of 4 or 5 components:

    /project_name/app_name/app_context.oss[?cmd=command[.command_context]]

or

    /project_name/app_name/app_context.oss[?cmd=command][&ctx=command_context]

where:

project_name is the directory where all applications of this project are located.

app_name is the directory where all pages of this application are located. Applications may be registered in the database. In this case the security filter refuses any request with an invalid or unknown application.

app_context is any application-specific, meaningful part of the application logic. We can register all possible contexts within an application, or allow the security administrator to enter any context in the permission database. In general the application context is a Web page name: the file name of an HTML page or of a dynamic, server-parsed page. It can even be a virtual dynamic page whose contents are generated on the fly by the Web server.
cmd is an optional command within the current context. The commands should be defined, and the security filter refuses requests with unknown commands. A command may consist of two parts: the command name itself and the command execution context. This context defines an additional logic layer inside the current command operation. It is useful when a complex application context has the same command for more than one kind of object. For example, the command 'Move' may be applied to files, directories, documents and URLs within one repository context.

ctx is the command context in a different form, as a query parameter. It exists for convenience, because sometimes, for example in submit buttons, it is impossible to use names that look like 'update.order' or 'update.account'. It is better to use 'Update' as the command name and set the command context with a hidden form field.

Example:

    /project/knowledgebase/dir.app
    /project/knowledgebase/file.app
    /project/knowledgebase/edit.app
    /project/knowledgebase/file.app?cmd=move
    /project/order/search.app
    /project/order/account.app?cmd=show
    /project/order/account.app?cmd=add.service
    /project/order/account.app?cmd=add&ctx=package

Such a separation of application logic is more or less complete, but the scheme may be extended or customized according to specific application needs.

When a request arrives, the security filter parses the request path into tokens. We assume that the request conforms to our naming convention, so if we cannot parse it into our tokens, the request is refused. Once we have the project, application, context, command and, optionally, command context tokens, we scan the access database for a match. The access table contains all 5 tokens as columns. Each column may contain an actual value or *, which means 'everything'. If no cmd is given, we set it to 'view'. This is logical, because basically an HTTP request is an attempt to view the document we are interested in on the Web server.
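
The parsing step can be sketched in Python, one of the languages mentioned earlier for such a handler. This is only an illustration under stated assumptions, not the AOLserver filter itself: the function name parse_request is hypothetical, the page extension (.oss or .app) is simply dropped, and the defaults are the ones this document uses, 'view' for a missing command and 'unknown' for a missing command context.

```python
from urllib.parse import urlsplit, parse_qs


def parse_request(url, default_cmd="view", default_ctx="unknown"):
    """Split a request URL into (project, app, context, cmd, ctx).

    Returns None when the URL does not follow the naming convention,
    in which case the caller should refuse the request.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 3 or not all(segments):
        return None  # path is not /project_name/app_name/app_context
    project, app, page = segments

    # Drop the page extension, if any (assumption: it carries no meaning).
    context = page.rsplit(".", 1)[0]
    if not context:
        return None

    query = parse_qs(parts.query)
    cmd = query.get("cmd", [default_cmd])[0]
    ctx = query.get("ctx", [None])[0]

    # "cmd=add.service" carries the command context after the dot.
    if "." in cmd:
        cmd, dotted_ctx = cmd.split(".", 1)
        ctx = ctx or dotted_ctx
    if not cmd:
        return None
    return (project, app, context, cmd, ctx or default_ctx)


print(parse_request("/project/order/account.app?cmd=add.service"))
print(parse_request("/project/order/account.app?cmd=add&ctx=package"))
print(parse_request("/project/knowledgebase/dir.app"))
```

Both forms of the command context produce the same kind of five-token tuple, so the permission lookup does not need to know which form the link used. A URL such as /index.htm yields None and would be refused.
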
For the optional command context token we assign the default value 'unknown'. It is important to have such default values. They simplify support and maintenance of access privileges, because the security system should never be in an unexpected or unknown state. We should know at any time what kind of request came in and what kind of resource it requested. The first three components of our naming convention are required; if there is any misspelling there, the request is refused because of a parsing failure. The only two optional components are the command and the command context, and since the command is the action the user is trying to perform, it should always be defined. Because an unspecified command is set to the default value 'view', we can set up the access database with permissions that say where this command is allowed and where it is not.

All records are sorted in such a way that more specific contexts are always at the top and more general ones at the bottom. With the usual ASCII ordering '*' sorts before letters and digits, so in descending order the wildcard records come last. The SQL statement that retrieves the access permissions for the user with ID 0 is:

    SELECT project_name, app_name, app_context, cmd_name, cmd_context, value
    FROM acls
    WHERE obj_id=0 AND obj_type='U'
    UNION
    SELECT project_name, app_name, app_context, cmd_name, cmd_context, value
    FROM acls,user_groups
    WHERE acls.obj_id=user_groups.group_id AND
          user_groups.user_id=0 AND
          acls.obj_type='G'
    ORDER BY 1 DESC,2 DESC,3 DESC,4 DESC,5 DESC,6

We retrieve the permissions from the 'acls' table for the specified user and for all groups this user belongs to. Sorting this way allows us to use the first match, because more specific matches are always ahead of more general ones. We do not need to scan all records every time. Because it is also possible to check access permissions inside the application, we load all permissions into memory and call a special routine that scans this list for a match. This way we do not need to query the SQL database every time we want to verify access to some part of our application.

Below we see that our user has access to two projects, 'portal' and 'doc'.
Within the 'portal' project he has access to the application called 'main'. Within this application he can execute 'view' and 'search' on any page. He can update the user preferences page 'prefs', and inside the page/context 'apps' he can execute any command except 'delete'.

    obj_id | obj_type | project_name | app_name | app_context | cmd_name | cmd_context | value
    -------+----------+--------------+----------+-------------+----------+-------------+-------
         0 | G        | portal       | main     | prefs       | update   | *           | Y
         0 | G        | portal       | main     | apps        | delete   | link        | Y
         0 | G        | portal       | main     | apps        | delete   | *           | N
         0 | G        | portal       | main     | apps        | *        | *           | Y
         0 | G        | portal       | main     | *           | view     | *           | Y
         0 | G        | portal       | main     | *           | search   | *           | Y
         0 | G        | doc          | *        | *           | view     | *           | Y

Let's assume that we have a request for the page /portal/main/apps/?cmd=view. Scanning our permission list, we find the first match at the fourth record from the top. The first three tokens match exactly and cmd_name is '*', which means any command. This record gives us 'Y', which means access to this resource is allowed. If we request /portal/main/apps?cmd=delete, the request matches the third record in the permission list, which gives us 'N'. Access is denied, because this user is not allowed to delete anything except links from the application page. To delete a link from some application, the URL should look like /portal/main/apps?cmd=delete.link. In this case the search for a permission record stops at the second record, which gives us 'Y'.

6. Conclusion

As we have seen, the security model is flexible enough to provide access control for Web applications. A further use is to call the security routine for every menu item on the main Web page before displaying it, so that each user sees only the menu items he is allowed to use.
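
Filtering a menu this way can be sketched in Python as follows. The names check_access, visible_menu and MENU are hypothetical and only illustrate the idea. The permission records are assumed to be already loaded into memory and sorted most specific first, as the ORDER BY query returns them. Refusing a request that matches no record is an assumption of this sketch.

```python
def check_access(permissions, project, app, context, cmd="view", ctx="unknown"):
    """Return True if the first matching record grants access, else False."""
    request = (project, app, context, cmd, ctx)
    for *pattern, value in permissions:
        # A column matches if it holds '*' or exactly the request token.
        if all(p == "*" or p == r for p, r in zip(pattern, request)):
            return value == "Y"
    return False  # assumption: no matching record means the request is refused


# Records as (project, app, context, cmd, cmd_context, value),
# sorted most specific first.
PERMISSIONS = [
    ("portal", "main", "prefs", "update", "*", "Y"),
    ("portal", "main", "apps", "delete", "link", "Y"),
    ("portal", "main", "apps", "delete", "*", "N"),
    ("portal", "main", "apps", "*", "*", "Y"),
    ("portal", "main", "*", "view", "*", "Y"),
    ("portal", "main", "*", "search", "*", "Y"),
    ("doc", "*", "*", "view", "*", "Y"),
]


def visible_menu(menu, permissions):
    """Keep only the menu items whose request the user is allowed to make."""
    return [label for label, request in menu if check_access(permissions, *request)]


# Hypothetical menu: a label plus the five tokens of the request behind it.
MENU = [
    ("Applications", ("portal", "main", "apps", "view", "unknown")),
    ("Delete application", ("portal", "main", "apps", "delete", "unknown")),
    ("Delete link", ("portal", "main", "apps", "delete", "link")),
    ("Edit preferences", ("portal", "main", "prefs", "update", "unknown")),
    ("Edit document", ("doc", "guide", "page", "update", "unknown")),
]

print(visible_menu(MENU, PERMISSIONS))
# prints ['Applications', 'Delete link', 'Edit preferences']
```

The same check_access call serves both the filter that guards incoming requests and the page code that builds the menu, so what is displayed and what is actually permitted cannot drift apart.
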
Even if somebody finds out that there are menu items he is not allowed to use and types such a URL into the browser, the security system refuses it anyway, because it prevents unauthorized access completely instead of merely hiding restricted URLs by leaving them out of the Web page.

This approach is most useful for new applications, when you start building the system with the structure of the access control components in mind and define from the beginning the smallest parts of the system that can be accessed individually. For existing applications it requires at least some work to convert all URLs inside the source code and HTML files. But because the approach is flexible and simple, even existing applications can benefit from this access control, especially constantly evolving sites with a complex security policy.

7. References

    http://www.aol.com
    http://www.arsdigita.com
    http://www.aolserver.com
    http://www.apache.com
    http://www.oracle.com
    http://www.postgresql.com
    http://www.python.org
    http://www.scriptics.com
    http://www.perl.org
    http://www.php.net