Information about Query String

On the World Wide Web, a query string is the part of a URL that contains data to be passed to web applications such as CGI programs.

Figure: The Mozilla location bar showing a URL with the query string title=Main_page&action=raw


When a web page is requested via the Hypertext Transfer Protocol, the server locates a file in its file system based on the requested URL. This file may be a regular file or a program. In the second case, the server may (depending on its configuration) run the program and send its output as the requested page. The query string is the part of the URL that is passed to the program. Its use permits data to be passed from the HTTP client (often a browser) to the program that generates the web page.

Structure

A typical URL containing a query string is as follows:

http://server/path/program?query_string


When a server receives a request for such a page, it runs a program (if configured to do so), passing the query_string unchanged to the program. The question mark is used as a separator and is not part of the query string.
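The split described above can be sketched with Python's standard urllib.parse module. This is only an illustration: the URL is the placeholder from the text, and the variable names are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Placeholder URL from the text; "server", "path" and "program" are not real names.
url = "http://server/path/program?query_string"

parts = urlsplit(url)

# The question mark is only a separator, so it appears in neither component.
program_path = parts.path    # the part that selects the program
query_string = parts.query   # the part handed to the program unchanged

print(program_path)
print(query_string)
```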

A link in a web page may have a URL that contains a query string. However, the main use of query strings is to contain the content of an HTML form, also known as web form. In particular, when a form containing the fields field1, field2, field3 is submitted, the content of the fields is encoded as a query string as follows:

field1=value1&field2=value2&field3=value3...
  • The query string is composed of a series of field-value pairs.
  • Within each pair, the field name and the value are separated by an equal sign, '='.
  • The pairs are separated from one another by the ampersand, '&'.
For each field of the form, the query string contains a pair field=value. Web forms may include fields that are not visible to the user; these fields are included in the query string when the form is submitted.
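As a rough sketch of how a receiving program might take such a string apart, the following uses parse_qsl from Python's standard library and, for comparison, a manual split on '&' and '='. The query string is the one shown above without the trailing dots; the manual version ignores URL decoding and is only meant to show the structure.

```python
from urllib.parse import parse_qsl

# Query string from the text above (the trailing "..." is left out).
query_string = "field1=value1&field2=value2&field3=value3"

# Library parsing: returns a list of (field, value) pairs in order.
pairs = parse_qsl(query_string)

# Manual parsing: split the series on "&", then each pair on the first "=".
manual_pairs = [tuple(pair.split("=", 1)) for pair in query_string.split("&")]

for field, value in pairs:
    print(field, "->", value)
```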

This convention of a name, then an equal sign, then a value, with pairs joined by ampersands, is a W3C recommendation[1]. The W3C also provides an appendix entry[2] recommending that servers accept a semicolon in place of the ampersand as the separator.

Technically, the form content is only encoded as a query string when the form submission method is GET. The same encoding is used by default when the submission method is POST, but the result is not sent as a query string, that is, is not added to the action URL of the form. Rather, the string is sent as the body of the request.
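The difference between the two submission methods can be sketched by building both kinds of request without sending them. The action URL and the field names below are hypothetical, and Python's urllib is only one of many ways to construct such requests.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical action URL and form fields, used only for illustration.
action_url = "http://server/path/program"
fields = {"field1": "value1", "field2": "value2"}

# The same encoding is used for both submission methods.
encoded = urlencode(fields)

# GET: the encoded form content is appended to the action URL as a query string.
get_request = Request(action_url + "?" + encoded, method="GET")

# POST: the same string is carried in the request body; the URL stays unchanged.
post_request = Request(action_url, data=encoded.encode("ascii"), method="POST")

# Nothing is sent over the network; the objects are only inspected.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```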

URL encoding

Some characters cannot be part of a URL (for example, the space), and some other characters have a special meaning in a URL: for example, the character # is used to locate a point within a page, and the character = is used to separate a name from a value. A query string may need to be converted to satisfy these constraints. This can be done using a scheme known as URL encoding.

In particular, encoding the query string uses the following rules:
  • [a-zA-Z0-9] | '.' | '-' | '~' | '_' are left as-is
  • SPACE is encoded as '+'
  • All other characters are encoded as %HH, where HH is the two-digit hexadecimal value of the byte, with any non-ASCII characters first encoded as UTF-8 (or another specified encoding)
The encoding of SPACE as '+' and the selection of "as-is" characters distinguish this encoding from RFC 1738.
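These rules can be tried out with quote_plus from Python's standard library, which follows the same conventions. The sample strings are made up for the illustration, and the sketch assumes Python 3.7 or later, where '~' is left unescaped.

```python
from urllib.parse import quote_plus

# Letters, digits, '.', '-', '~' and '_' are left as-is.
unreserved = quote_plus("AZaz09.-~_")

# A space is encoded as '+'.
with_space = quote_plus("a b")

# Characters with a special meaning in a URL are percent-encoded.
with_reserved = quote_plus("a=b&c#d")

# Non-ASCII characters are encoded as UTF-8 first, then percent-encoded.
non_ascii = quote_plus("café")

print(unreserved, with_space, with_reserved, non_ascii)
```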

RFC

As defined in RFC 1738, a URL of scheme http can contain a searchpart following the rest of the URL and separated from it by a ? character. RFC 3986 specifies that the query component of a URI is the part between the ? and the end of the URI or the character #, whichever comes first. The term query string is commonly used to refer to this part in the case of HTTP URLs.
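A small sketch of the RFC 3986 boundaries, using a hypothetical URL that carries both a query component and a fragment. Python's urlsplit is used here as one parser that follows this division.

```python
from urllib.parse import urlsplit

# Hypothetical URL; the names are placeholders, not a real address.
parts = urlsplit("http://server/path/program?field1=value1#section")

# The query runs from just after "?" up to, but not including, "#".
query = parts.query
fragment = parts.fragment

print(query)
print(fragment)
```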

Example

If an HTML page contains a form that is submitted with the GET method to the program test.cgi and has two text fields named first and second, and the user enters the strings “this is a field” and “was it clear (already)?” in the two text fields and presses the submit button, the program test.cgi will receive the following query string:

first=this+is+a+field&second=was+it+clear+%28already%29%3F
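The same query string can be reproduced with urlencode from Python's standard library, which gives a quick way to check the encoding by hand. The field names first and second are taken from the query string above; the dictionary is just a stand-in for the submitted form.

```python
from urllib.parse import urlencode

# Stand-in for the submitted form: field names mapped to what the user typed.
form_content = {
    "first": "this is a field",
    "second": "was it clear (already)?",
}

# urlencode joins the pairs with "&" and applies the "+"/%HH encoding to each value.
query_string = urlencode(form_content)

print(query_string)
```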

In some UNIX-based web servers, the program receives the query string as an environment variable named QUERY_STRING.
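A minimal sketch of a program reading that variable is shown below. The function name read_form_fields is hypothetical, and the environment is simulated with a plain dictionary so that the example runs outside a web server; a real CGI program would rely on the server to set QUERY_STRING.

```python
import os
from urllib.parse import parse_qs


def read_form_fields(environ=os.environ):
    """Return the decoded fields found in QUERY_STRING (empty if it is unset)."""
    query_string = environ.get("QUERY_STRING", "")
    return parse_qs(query_string)


# Simulated environment carrying the query string from the example above.
example_environ = {
    "QUERY_STRING": "first=this+is+a+field&second=was+it+clear+%28already%29%3F"
}

fields = read_form_fields(example_environ)

# parse_qs maps each field name to a list, since a name may occur more than once.
print(fields["first"][0])
print(fields["second"][0])
```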

Tracking

A program receiving a query string can ignore part or all of it. If the requested URL corresponds to a file and not to a program, the whole query string is ignored. However, regardless of whether the query string is used or not, the whole URL including it is stored in the server log files.

These facts allow query strings to be used to track users in a manner similar to that provided by HTTP cookies. For this to work, every time the user downloads a page, a unique identifier is chosen and added as a query string to the URLs of all links the page contains. As soon as the user follows one of these links, the corresponding URL is requested from the server. This way, the download of this page is linked with the previous one.

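For example, when a web page containing two links with the texts “see my page!” and “mine is better” is requested, a unique string, such as sdfsd23423, is chosen, and the page is modified so that this string is appended as a query string to the URL of each link. The first link, which points to frank.html, then points to frank.html?sdfsd23423.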

The addition of the query string does not change the way the page is shown to the user. When the user follows, for example, the first link, the browser requests the page frank.html?sdfsd23423 from the server, which ignores what follows the ? and sends the page frank.html as expected, adding the query string to its links as well.

This way, any subsequent page request from this user will carry the same query string sdfsd23423, making it possible to establish that all these pages have been viewed by the same user. Query strings are often used in association with web beacons.
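The mechanism can be sketched in a few lines. The function names are hypothetical, the identifier is generated with Python's secrets module as one possible choice, and the sketch assumes that the link URLs do not already carry a query string.

```python
import secrets
from urllib.parse import urlsplit


def identifier_for(requested_url):
    """Reuse the identifier carried by the request, or choose a new one."""
    return urlsplit(requested_url).query or secrets.token_hex(8)


def tag_links(links, identifier):
    """Append the identifier as a query string to every link URL."""
    return [f"{link}?{identifier}" for link in links]


# First request: no identifier yet, so a fresh one is chosen.
first_id = identifier_for("index.html")

# Later request: the identifier from the text is carried over to the next page's links.
later_id = identifier_for("frank.html?sdfsd23423")
tagged = tag_links(["frank.html"], later_id)

print(tagged)
```

Because the identifier travels only in the links, a request that arrives without one (for example, from a bookmark to the bare URL) starts a new trail.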

The main differences between query strings used for tracking and HTTP cookies are that:
  1. Query strings form part of the URL, and are therefore included if the user saves or sends the URL to another user; cookies can be maintained across browsing sessions, but are not saved or sent with the URL.
  2. If the user arrives at the same web server by two (or more) independent paths, they will be assigned two different query strings, while the stored cookies are the same.

Flexibility vs. Security

A URL query string allows for flexibility in retrieving data from a web server and possibly from the database used to populate pages for that web server. A read-only data store, such as a weather mapping service, is one example where URL query strings can be used with great flexibility.

In some circumstances, a URL query string may expose security issues, because a user can edit it to try to retrieve data that they are not authorized to access. In particular, a URL query string containing a username and password could be edited repeatedly to guess at valid login credentials for a particular web site.

External links

  • RFC 1738
  • RFC 3986


This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.