# cgi.rb - cgi support library
# Copyright (C) 2000 Network Applied Communication Laboratory, Inc.
# Copyright (C) 2000 Information-technology Promotion Agency, Japan
# Author: Wakou Aoyama <wakou@ruby-lang.org>
# Documentation: Wakou Aoyama (RDoc'd and embellished by William Webber)
# The Common Gateway Interface (CGI) is a simple protocol
# for passing an HTTP request from a web server to a
# standalone program, and returning the output to the web
# browser. Basically, a CGI program is called with the
# parameters of the request passed in either in the
# environment (GET) or via $stdin (POST), and everything
# it prints to $stdout is returned to the client.
# This file holds the +CGI+ class. This class provides
# functionality for retrieving HTTP request parameters,
# managing cookies, and generating HTML output. See the
# class documentation for more details and examples of use.
# The file cgi/session.rb provides session management
# functionality; see that file for more details.
# See http://www.w3.org/CGI/ for more information on the CGI
raise "Please, use ruby 1.5.4 or later." if RUBY_VERSION < "1.5.4"
# CGI class. See documentation for the file cgi.rb for an overview
# CGI is a large class, providing several categories of methods, many of which
# are mixed in from other modules. Some of the documentation is in this class,
# some in the modules CGI::QueryExtension and CGI::HtmlExtension. See
# CGI::Cookie for specific information on handling cookies, and cgi/session.rb
# (CGI::Session) for information on sessions.
# For queries, CGI provides methods to get at environmental variables,
# parameters, cookies, and multipart request data. For responses, CGI provides
# methods for writing output and generating HTML.
# Read on for more details. Examples are provided at the bottom.
# The CGI class dynamically mixes in parameter and cookie-parsing
# functionality, environmental variable access, and support for
# parsing multipart requests (including uploaded files) from the
# CGI::QueryExtension module.
# === Environmental Variables
# The standard CGI environmental variables are available as read-only
# attributes of a CGI object. The following is a list of these variables:
# AUTH_TYPE HTTP_HOST REMOTE_IDENT
# CONTENT_LENGTH HTTP_NEGOTIATE REMOTE_USER
# CONTENT_TYPE HTTP_PRAGMA REQUEST_METHOD
# GATEWAY_INTERFACE HTTP_REFERER SCRIPT_NAME
# HTTP_ACCEPT HTTP_USER_AGENT SERVER_NAME
# HTTP_ACCEPT_CHARSET PATH_INFO SERVER_PORT
# HTTP_ACCEPT_ENCODING PATH_TRANSLATED SERVER_PROTOCOL
# HTTP_ACCEPT_LANGUAGE QUERY_STRING SERVER_SOFTWARE
# HTTP_CACHE_CONTROL REMOTE_ADDR
# For each of these variables, there is a corresponding attribute with the
# same name, except all lower case and without a preceding HTTP_.
# +content_length+ and +server_port+ are integers; the rest are strings.
# The method #params() returns a hash of all parameters in the request as
# name/value-list pairs, where the value-list is an Array of one or more
# values. The CGI object itself also behaves as a hash of parameter names
# to values, but only returns a single value (as a String) for each
# For instance, suppose the request contains the parameter
# "favourite_colours" with the multiple values "blue" and "green". The
# following behaviour would occur:
# cgi.params["favourite_colours"] # => ["blue", "green"]
# cgi["favourite_colours"] # => "blue"
# If a parameter does not exist, the former method will return an empty
# array, the latter an empty string. The simplest way to test for existence
# of a parameter is by the #has_key? method.
# HTTP Cookies are automatically parsed from the request. They are available
# from the #cookies() accessor, which returns a hash from cookie name to
# If a request's method is POST and its content type is multipart/form-data,
# then it may contain uploaded files. These are stored by the QueryExtension
# module in the parameters of the request. The parameter name is the name
# attribute of the file input field, as usual. However, the value is not
# a string, but an IO object, either an IOString for small files, or a
# Tempfile for larger ones. This object also has the additional singleton
# #local_path():: the path of the uploaded file on the local filesystem
# #original_filename():: the name of the file on the client computer
# #content_type():: the content type of the file
# The CGI class provides methods for sending header and content output to
# the HTTP client, and mixes in methods for programmatic HTML generation
# from CGI::HtmlExtension and CGI::TagMaker modules. The precise version of HTML
# to use for HTML generation is specified at object creation time.
# The simplest way to send output to the HTTP client is using the #out() method.
# This takes the HTTP headers as a hash parameter, and the body content
# via a block. The headers can be generated as a string using the #header()
# method. The output stream can be written directly to using the #print()
# Each HTML element has a corresponding method for generating that
# element as a String. The name of this method is the same as that
# of the element, all lowercase. The attributes of the element are
# passed in as a hash, and the body as a no-argument block that evaluates
# to a String. The HTML generation module knows which elements are
# always empty, and silently drops any passed-in body. It also knows
# which elements require matching closing tags and which don't. However,
# it does not know what attributes are legal for which elements.
# There are also some additional HTML generation methods mixed in from
# the CGI::HtmlExtension module. These include individual methods for the
# different types of form inputs, and methods for elements that commonly
# take particular attributes where the attributes can be directly specified
# as arguments, rather than via a hash.
# value = cgi['field_name'] # <== value string for 'field_name'
# # if not 'field_name' included, then return "".
# fields = cgi.keys # <== array of field names
# # returns true if form has 'field_name'
# cgi.has_key?('field_name')
# cgi.has_key?('field_name')
# cgi.include?('field_name')
# CAUTION! cgi['field_name'] returned an Array with the old
# cgi.rb(included in ruby 1.6)
# === Get form values as hash
# cgi.params['new_field_name'] = ["value"] # add new param
# cgi.params['field_name'] = ["new_value"] # change value
# cgi.params.delete('field_name') # delete param
# cgi.params.clear # delete all params
# === Save form values to file
# db = PStore.new("query.db")
# db["params"] = cgi.params
# === Restore form values from file
# db = PStore.new("query.db")
# cgi.params = db["params"]
# === Get multipart form values
# value = cgi['field_name'] # <== value string for 'field_name'
# value.read # <== body of value
# value.local_path # <== path to local file of value
# value.original_filename # <== original filename of value
# value.content_type # <== content_type of value
# and value has StringIO or Tempfile class methods.
# values = cgi.cookies['name'] # <== array of 'name'
# # if not 'name' included, then return [].
# names = cgi.cookies.keys # <== array of cookie names
# and cgi.cookies is a hash.
# for name, cookie in cgi.cookies
# cookie.expires = Time.now + 30
# cgi.out("cookie" => cgi.cookies) {"string"}
# cgi.cookies # { "name1" => cookie1, "name2" => cookie2, ... }
# cgi.cookies['name'].expires = Time.now + 30
# cgi.out("cookie" => cgi.cookies['name']) {"string"}
# === Print http header and html string to $DEFAULT_OUTPUT ($>)
# cgi = CGI.new("html3") # add HTML generation methods
# cgi.head{ cgi.title{"TITLE"} } +
# cgi.textarea("get_text") +
# "params: " + cgi.params.inspect + "\n" +
# "cookies: " + cgi.cookies.inspect + "\n" +
# ENV.collect() do |key, value|
# key + " --> " + value + "\n"
# # add HTML generation methods
# CGI.new("html3") # html3.2
# CGI.new("html4") # html4.01 (Strict)
# CGI.new("html4Tr") # html4.01 Transitional
# CGI.new("html4Fr") # html4.01 Frameset
# String for carriage return
# Standard internet newline sequence
REVISION = '$Id: cgi.rb 26086 2009-12-14 02:40:07Z shyouhei $' #:nodoc:
NEEDS_BINMODE = true if /WIN/ni.match(RUBY_PLATFORM)
# Path separators in different environments.
PATH_SEPARATOR = {'UNIX'=>'/', 'WINDOWS'=>'\\', 'MACINTOSH'=>':'}
"PARTIAL_CONTENT" => "206 Partial Content",
"MULTIPLE_CHOICES" => "300 Multiple Choices",
"MOVED" => "301 Moved Permanently",
"REDIRECT" => "302 Found",
"NOT_MODIFIED" => "304 Not Modified",
"BAD_REQUEST" => "400 Bad Request",
"AUTH_REQUIRED" => "401 Authorization Required",
"FORBIDDEN" => "403 Forbidden",
"NOT_FOUND" => "404 Not Found",
"METHOD_NOT_ALLOWED" => "405 Method Not Allowed",
"NOT_ACCEPTABLE" => "406 Not Acceptable",
"LENGTH_REQUIRED" => "411 Length Required",
"PRECONDITION_FAILED" => "412 Precondition Failed",
"SERVER_ERROR" => "500 Internal Server Error",
"NOT_IMPLEMENTED" => "501 Method Not Implemented",
"BAD_GATEWAY" => "502 Bad Gateway",
"VARIANT_ALSO_VARIES" => "506 Variant Also Negotiates"
# Abbreviated day-of-week names specified by RFC 822
RFC822_DAYS = %w[ Sun Mon Tue Wed Thu Fri Sat ]
# Abbreviated month names specified by RFC 822
RFC822_MONTHS = %w[ Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec ]
private :env_table, :stdinput, :stdoutput
# url_encoded_string = CGI::escape("'Stop!' said Fred")
# # => "%27Stop%21%27+said+Fred"
string.gsub(/([^ a-zA-Z0-9_.-]+)/n) do
'%' + $1.unpack('H2' * $1.size).join('%').upcase
# string = CGI::unescape("%27Stop%21%27+said+Fred")
# # => "'Stop!' said Fred"
def CGI::unescape(string)
string.tr('+', ' ').gsub(/((?:%[0-9a-fA-F]{2})+)/n) do
[$1.delete('%')].pack('H*')
# Escape special characters in HTML, namely &\"<>
# CGI::escapeHTML('Usage: foo "bar" <baz>')
# # => "Usage: foo "bar" <baz>"
def CGI::escapeHTML(string)
string.gsub(/&/n, '&').gsub(/\"/n, '"').gsub(/>/n, '>').gsub(/</n, '<')
# Unescape a string that has been HTML-escaped
# CGI::unescapeHTML("Usage: foo "bar" <baz>")
# # => "Usage: foo \"bar\" <baz>"
def CGI::unescapeHTML(string)
string.gsub(/&(amp|quot|gt|lt|\#[0-9]+|\#x[0-9A-Fa-f]+);/n) do
when /\A#0*(\d+)\z/n then
if Integer($1) < 65536 and ($KCODE[0] == ?u or $KCODE[0] == ?U)
when /\A#x([0-9a-f]+)\z/ni then
if $1.hex < 65536 and ($KCODE[0] == ?u or $KCODE[0] == ?U)
# Escape only the tags of certain HTML elements in +string+.
# Takes an element or elements or array of elements. Each element
# is specified by the name of the element, without angle brackets.
# This matches both the start and the end tag of that element.
# The attribute list of the open tag will also be escaped (for
# instance, the double-quotes surrounding attribute values).
# print CGI::escapeElement('<BR><A HREF="url"></A>', "A", "IMG")
# # "<BR><A HREF="url"></A>"
# print CGI::escapeElement('<BR><A HREF="url"></A>', ["A", "IMG"])
# # "<BR><A HREF="url"></A>"
def CGI::escapeElement(string, *elements)
elements = elements[0] if elements[0].kind_of?(Array)
string.gsub(/<\/?(?:#{elements.join("|")})(?!\w)(?:.|\n)*?>/ni) do
# Undo escaping such as that done by CGI::escapeElement()
# print CGI::unescapeElement(
# CGI::escapeHTML('<BR><A HREF="url"></A>'), "A", "IMG")
# # "<BR><A HREF="url"></A>"
# print CGI::unescapeElement(
# CGI::escapeHTML('<BR><A HREF="url"></A>'), ["A", "IMG"])
# # "<BR><A HREF="url"></A>"
def CGI::unescapeElement(string, *elements)
elements = elements[0] if elements[0].kind_of?(Array)
string.gsub(/<\/?(?:#{elements.join("|")})(?!\w)(?:.|\n)*?>/ni) do
# Format a +Time+ object as a String using the format specified by RFC 1123.
# CGI::rfc1123_date(Time.now)
# # Sat, 01 Jan 2000 00:00:00 GMT
def CGI::rfc1123_date(time)
return format("%s, %.2d %s %.4d %.2d:%.2d:%.2d GMT",
RFC822_DAYS[t.wday], t.day, RFC822_MONTHS[t.month-1], t.year,
# Create an HTTP header block as a string.
# Includes the empty line that ends the header block.
# +options+ can be a string specifying the Content-Type (defaults
# to text/html), or a hash of header key/value pairs. The following
# header keys are recognized:
# type:: the Content-Type header. Defaults to "text/html"
# charset:: the charset of the body, appended to the Content-Type header.
# nph:: a boolean value. If true, prepend protocol string and status code, and
# date; and sets default values for "server" and "connection" if not
# status:: the HTTP status code, returned as the Status header. See the
# list of available status codes below.
# server:: the server software, returned as the Server header.
# connection:: the connection type, returned as the Connection header (for
# length:: the length of the content that will be sent, returned as the
# language:: the language of the content, returned as the Content-Language
# expires:: the time on which the current content expires, as a +Time+
# object, returned as the Expires header.
# cookie:: a cookie or cookies, returned as one or more Set-Cookie headers.
# The value can be the literal string of the cookie; a CGI::Cookie
# object; an Array of literal cookie strings or Cookie objects; or a
# hash all of whose values are literal cookie strings or Cookie objects.
# These cookies are in addition to the cookies held in the
# Other header lines can also be set; they are appended as key: value.
# # Content-Type: text/html
# # Content-Type: text/plain